Method and probes for the identification of microbial genes specifically induced during host infection

ABSTRACT

The present invention relates to a class of microbial coding sequences the transcription or cotranscription of which is specifically induced during microbial infection of a host. These particular coding sequences or defined regions thereof may be used as probes to identify and isolate microbial virulence genes. The products of these virulence genes will provide potential targets for the development of vaccines or antimicrobial agents.

DESCRIPTION

This invention was made with Government support under Grant No. AI 36373 awarded by the National Institutes of Health. The government has certain rights in this invention.

TECHNICAL FIELD

The current invention relates to a class of microbial coding sequences that are specifically induced during infection of a host by a microbial pathogen and more particularly to a set of probes that may be used to identify and isolate microbial virulence genes. The products of these virulence genes will provide potential targets for the development of vaccines or antimicrobial agents.

BACKGROUND ART

Microbial pathogens, or disease-producing microorganisms, can infect a host by one of several mechanisms. For example, they may enter through a break in the skin, they may be introduced by vector transmission, or they may interact with a mucosal surface. Disease ensues following infection of a host, when the potential of the pathogen to disrupt normal bodily functions is fully expressed. Each disease-producing microorganism possesses a collection of virulence factors that enhance its pathogenicity and allow it to invade host or human tissues and disrupt normal bodily functions. Infectious diseases have been major killers over the last several thousand years, and while vaccines and antimicrobial agents have played an important role in the dramatic decrease in the incidence of infectious diseases, infectious diseases are still the number one cause of death world-wide.

Vaccines

Attempts to vaccinate are almost as old as man's attempt to rid himself of disease. However, during the last 200 years, since the time Edward Jenner deliberately and systematically inoculated a population with cowpox to avoid a smallpox epidemic, vaccination, at least in parts of the world, has controlled the following nine major diseases: smallpox, diphtheria, tetanus, yellow fever, pertussis, poliomyelitis, measles, mumps and rubella. In the case of smallpox, the disease has been totally eradicated. The impact of vaccination on the health of the world's population is hard to exaggerate. With the exception of safer water, no other modality, not even antibiotics, has had such a major effect on mortality reduction and population growth.

Following the first exposure of a host to an antigen, the immune response is often slow to yield antibody and the amount of antibody produced is small, i.e., the primary response. Upon secondary challenge with the same antigen the response is more rapid and of greater magnitude, i.e., the secondary response. The goal of vaccination is to induce an immune state equal to the accelerated secondary response that follows reinfection with a pathogenic microorganism. Vaccines are basically suspensions of viral, bacterial, or other pathogenic agents or their antigens which can be administered prophylactically to induce immunity.

Active vaccines can be divided into two general classes: subunit vaccines and whole organism vaccines. Subunit vaccines are prepared from components of the whole organism and are usually developed in order to avoid the use of live organisms that may cause disease, or to avoid the toxic components present in whole organism vaccines.

The use of purified capsular polysaccharide material of H. influenzae type b as a vaccine against the meningitis caused by this organism in humans is an example of a vaccine based upon an antigenic component. See Parks et al., J. Inf. Dis., 136 (Suppl.):551 (1977); Anderson et al., J. Inf. Dis., 136 (Suppl.):563 (1977); and Mäkela et al., J. Inf. Dis., 136 (Suppl.):543 (1977). Classically, subunit vaccines have been prepared by chemical inactivation of partially purified toxins, and hence have been called toxoids. Formaldehyde and glutaraldehyde have been the chemicals of choice to detoxify bacterial toxins. Both diphtheria and tetanus toxins have been successfully inactivated with formaldehyde, resulting in a safe and effective toxoid vaccine which has been used for over 40 years to control diphtheria and tetanus. See Pappenheimer, A. M., Diphtheria. In: Bacterial Vaccines (R. Germanier, ed.), Academic Press, Orlando, Fla., pp. 1-36 (1984); Bizzini, B., Tetanus. Id. at 37-68. In contrast to subunit vaccines, whole organism vaccines make use of the entire organism for vaccination. The organism may be used killed or alive (usually attenuated) depending upon the requirements necessary to elicit protective immunity. The following discussion will focus on live but attenuated microorganisms (live vaccines).

In the case of intracellular pathogens, it is generally agreed that live vaccines induce a highly effective type of immune response. Ideally, these attenuated microorganisms maintain the full integrity of cell-surface constituents necessary for specific antibody induction yet are unable to cause disease, because they fail to produce virulence factors, grow too slowly, or do not grow at all in the host. Additionally, these attenuated strains should have no probability of reverting to a virulent wild-type strain. Traditionally, live vaccines have been obtained by isolating an antigenically related virus from another species, by selecting attenuation through passage and adaptation in a nontargeted species or in tissue cultures, or by selection of temperature-sensitive variants.

In contrast to these somewhat haphazard approaches of selecting for live vaccines, modern developmental approaches introduce specific mutations into the genome of the pathogen which affect the ability of that pathogen to induce disease, that is, specific mutations are introduced into genes involved in virulence. Defined genetic manipulation is the current approach being taken in an attempt to develop live vaccines for various diseases caused by pathogenic microorganisms. U.S. Pat. No. 5,210,035 exemplifies this approach by describing the construction of vaccine strains from pathogenic microorganisms made non-virulent by the introduction of complete and non-reverting mutational blocks in the biosynthesis pathways, causing a requirement for metabolites not available in host tissues. Specifically, Stocker teaches that S. typhi may be attenuated by interrupting the pathway for biosynthesis of aromatic (aro) metabolites which renders Salmonella auxotrophic (i.e., nutritionally dependent) for p-aminobenzoic acid (PABA) and 2,3-dihydroxybenzoate, substances not available to bacteria in mammalian tissue. These aro⁻ mutants are unable to synthesize chorismic acid (a precursor of the aromatic compounds PABA and 2,3-dihydroxybenzoate), and no other pathways in Salmonella exist that can overcome this deficiency. As a consequence of this auxotrophy, the aro⁻ deleted bacteria are not capable of extensive proliferation within the host; however, they reside and grow intracellularly long enough to stimulate protective immune responses.

Unfortunately, the development of vaccines based on chemical toxoids, discussed previously, is difficult since protective antigens and the genes encoding them must first be identified and then procedures must be developed to efficiently isolate the antigens. Similarly, modern approaches to the rational development of live vaccines have been hampered by the limited knowledge available concerning genes that are involved in virulence and thus the targets of mutagenesis.

Antimicrobial Agents

The medical literature up to about 1930 is full of vivid descriptions of gruesome infections by streptococci, staphylococci, and clostridia. The dawning of the age of antimicrobial therapy, with the introduction of the sulfonamides in the 1930s, allowed physicians finally to cure many of these fatal infections. From the outset, antibiotics were heralded as a panacea for everything from fungus-infected pear orchards to the common cold. Penicillin lozenges were popular as were nostrums such as antibiotic mouthwashes and throat sprays. By the 1950s, doctors jubilantly predicted an end to infectious diseases and, by the 1980s, half of all drug companies had stopped developing antibiotics, believing the battle won.

The stunning success of the pharmaceutical industry in the United States, Japan, the United Kingdom, France, and Germany in creating new antibiotics over the past three decades has caused society to become complacent about the potential of bacterial resistance, but what once was a situation where antibiotic controls prevailed has since deteriorated badly. C. T. Walsh, in a technical paper entitled “Vancomycin Resistance: Decoding the Molecular Logic,” Science, 261:308-309 (1993), stated that “[t]he 1990s may come to be remembered as a decade in which infectious diseases made a dramatic worldwide resurgence, largely because of the appearance of antibiotic-resistant microbes.”

In economic terms alone, such antibiotic resistance is costly. A recent estimate is that the extra expense of treating multiresistant infections is $100 to $200 million annually in the United States, see A. Gibbons, Science, 257:1036-1038 (1992). But economic impact reflects only part of the true costs of dealing with antibiotic resistant infections. More than 13,000 Americans are dying each year from drug resistant bacteria and doctors warn that the problem is steadily worsening. The FDA considers bacterial drug resistance threatening enough that it is planning incentives to encourage development of new antibiotics.

To date, the vast majority of antibiotics in the marketplace were derived from large-scale screens or from analog development programs. Classification of antibiotics by mechanisms of action appears below in Table 1.

TABLE 1

Mechanisms of action                                         Agent
Inhibition of synthesis or damage to cell wall               Penicillins, Glycopeptides, Cephalosporins, Monobactams
Inhibition of synthesis or damage to cytoplasmic membrane    Polymyxins, Polyene antifungals
Inhibition of synthesis or metabolism of nucleic acids       Quinolones, Rifampin, Nitrofurantoins
Protein biosynthesis                                         Tetracyclines, Chloramphenicol, Macrolides, Lincosamides, Aminoglycosides
Modification of energy metabolism                            Sulfonamides, Trimethoprim, Dapsone

As is shown in Table 1, there are very few mechanisms of action that are exploited by current antibiotics. Unfortunately, to date the majority of antimicrobial agents have been randomly discovered. Robotic systems can perform thousands of tests per day by means of radioactive labeling or spectroscopic detection, making it feasible to scan 100,000 to 500,000 compounds in a year. While the efforts are still in their early stages, some companies are beginning to use “rational drug design” to design new drugs that can use selective mechanisms to destroy a specific microbe. Understanding the biological or biochemical mechanism of a disease often suggests the types of molecules needed for new drugs. Consequently, not knowing what makes infectious diseases virulent in the first place is a fundamental fact which has severely limited the continued development of vaccines and antibiotics. A method of identifying genes that are expressed by microbial pathogens infecting a host has been developed: in vivo expression technology (IVET).

In Vivo Expression Technology

Essentially, the IVET selection strategy disclosed in U.S. Pat. No. 5,434,065, herein incorporated by reference, originates with a microbial strain carrying a mutation in a biosynthetic gene that highly attenuates its growth in a given host. Next, growth of the mutant strain in the host is complemented by transcriptional fusions to the same biosynthetic gene. Although, in theory, many different biosynthetic genes (e.g., aroA, thyA, asd) could be used in this selection scheme, initial efforts have focused on the purA gene of Salmonella typhimurium; purA mutants are highly attenuated in their ability to cause mouse typhoid and to persist in host tissues. This purA requirement provides a basis for the positive selection of microbial virulence genes that are specifically induced in a given host.

The first step in construction of purA operon fusions as per U.S. Pat. No. 5,434,065 was to build a pool of recombinant clones containing random fragments of Salmonella DNA. Partial Sau3A I restriction digests of total S. typhimurium DNA were used to obtain the random DNA fragments, which were then cloned 5′ to an artificial operon having a promoterless purA gene fused to a promoterless lacZY gene on the vector, pIVET1. In the recombinant plasmids of interest, the fragment contained a Salmonella promoter in the proper orientation to drive the purA-lac fusion. This random pool was then introduced into a purA deletion strain of S. typhimurium that does not contain the Pi replication protein. Because the vector cannot replicate autonomously without Pi, selection for ampicillin resistance requires the integration of the recombinant plasmids into the chromosome by homologous recombination, using the cloned Salmonella DNA as the source of homology. In the clones of interest, the product of the integration event generates a duplication of Salmonella material in which one promoter drives the purA-lac fusion, while the other promoter drives the expression of a wild-type copy of the putative virulence gene as shown in FIG. 1. The expression of both of these promoters is selected in the host. Expression of the purA-lac fusion is selected to overcome the parental purA auxotrophy. Expression of the virulence gene is selected because the gene product is required for infection. The expression levels of the operon fusions can be monitored both on laboratory media and in animal tissues by measuring the levels of β-galactosidase activity.

A large collection of recombinant plasmids that contained the purA-lac transcriptional fusions was integrated into the chromosome of a purA deletion strain of S. typhimurium (FIG. 1). The subsequent pool of integrated fusion strains was injected intraperitoneally (i.p.) into a BALB/c mouse. After a 3 day incubation, the mouse was sacrificed and the bacteria were recovered from an internal organ such as the spleen, intestine, or liver. Only those bacterial cells that contain fusions to chromosomal promoters that had sufficient transcription levels to provide enough of the purA gene product were selected (to overcome the parental purine deficiency) by demanding the survival and propagation of the fusion strain in the host. Note that all genes that have constitutively active promoters will answer the IVET selection because they would produce sufficient levels of purA gene product (and LacZ) all the time. Thus, when the mouse-selected pool was plated on MacConkey Lactose indicator medium, an increase in the percentage of Lac⁺ clones is expected compared to the pre-selected pool. This expected shift has been termed the “RED SHIFT.” To test the prediction, the percentage of Lac⁺ clones in the pre-selected and mouse-selected fusions was determined by plating on MacConkey Lactose indicator medium. In the pre-selected pool, 50% of the fusions were transcriptionally active or “ON” in vitro (red or pink colonies), whereas in the mouse-selected pool 95% of the fusions were “ON.” This observed shift in percentage in favor of Lac⁺ clones (the RED SHIFT) suggests that the IVET system selected for promoters that are active in vivo. Since the underlying premise of IVET is that some virulence genes will be expressed only when they are in the proper environment and not on simple laboratory media, we focused our efforts on the rare 5% Lac⁻ class of fusions that were recovered from the spleens of infected mice. Presumably, these Lac⁻ strains contained fusions to genes that were “ON” in the mouse (to complement the purA deficiency) and “OFF” out of the mouse.
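The two-step screening logic described above (survival in the host, then Lac phenotype on indicator medium) can be illustrated with a minimal sketch. The `Fusion` record, its field names, and the toy pool are hypothetical and serve only to show how the Lac⁺ percentage shifts and how the Lac⁻ survivors are singled out; they are not data from the selection itself.

```python
from dataclasses import dataclass


@dataclass
class Fusion:
    name: str                    # hypothetical clone label
    recovered_from_host: bool    # survived the purA selection in the mouse
    lac_positive_in_vitro: bool  # red or pink colony on MacConkey Lactose


def classify(fusion):
    """Assign a fusion strain to one of the classes discussed in the text."""
    if not fusion.recovered_from_host:
        return "not selected"
    if fusion.lac_positive_in_vitro:
        return "ON in vivo, ON in vitro"
    return "ON in vivo, OFF in vitro (ivi candidate)"


def percent_lac_positive(fusions):
    """Percentage of strains in a pool that are Lac+ on indicator medium."""
    fusions = list(fusions)
    return 100.0 * sum(f.lac_positive_in_vitro for f in fusions) / len(fusions)


# Toy pool (illustrative values only).
pool = [
    Fusion("f1", True, True),
    Fusion("f2", True, True),
    Fusion("f3", True, False),
    Fusion("f4", False, False),
    Fusion("f5", False, False),
    Fusion("f6", False, True),
]
recovered = [f for f in pool if f.recovered_from_host]
candidates = [f.name for f in recovered if not f.lac_positive_in_vitro]

print(percent_lac_positive(pool), percent_lac_positive(recovered), candidates)
```

In this toy pool the Lac⁺ fraction rises after selection, mirroring the direction of the RED SHIFT, and the single Lac⁻ survivor is the kind of strain that was pursued further.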

While the IVET approach provides an important new way to identify genes that are involved in virulence, some shortcomings were encountered using the IVET method discussed above. There is still a need, therefore, for a method and a means for identifying and isolating microbial virulence genes the products of which will provide a basis for rational vaccine and drug design.

DISCLOSURE OF INVENTION

Accordingly, it is an object of this invention to identify a class of microbial genes involved in virulence.

It is an additional object of this invention to enhance the selectivity of methods currently available to identify virulence genes.

It is a further object of this invention to provide a set of coding sequences known to be involved in pathogenesis for use as probes to identify and isolate other microbial genes that are cotranscribed with said coding sequences during infection.

Additional objects, advantages and novel features of this invention shall be set forth in part in the description that follows, and in part will become apparent to those skilled in the art upon examination of the following specification or may be learned by the practice of the invention. The objects and advantages of the invention may be realized and attained by means of the instrumentalities, combinations, compositions, and methods particularly pointed out in the appended claims.

To achieve the foregoing and other objects and in accordance with the purposes of the present invention, as embodied and broadly described herein, the method and compositions of this invention comprise using a class of coding sequences to identify genes, the transcription or cotranscription of which is induced during microbial infection of a host.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and form a part of the specification, illustrate the preferred embodiments of the present invention, and together with the description serve to explain the principles of the invention.

FIG. 1 is a flow sheet representing a method of selecting genes that are induced in a host according to the IVET methodology of U.S. Pat. No. 5,434,065.

BEST MODE FOR CARRYING OUT THE INVENTION

In general and overall scope, the present invention provides a method and means for identifying and isolating a class of microbial virulence genes whose products will define metabolic, physiological, and genetic factors that contribute to the virulence of microbial pathogens, providing new targets for vaccine and antimicrobial drug development. By modifying the IVET methodology described previously, its selectivity was greatly enhanced, allowing for the identification of a number of genes which are induced during microbial infection of a host. In turn, these genes or portions thereof may be used as probes to identify other genes that are also induced during infection of a host. Consequently, the method of this invention further relies on a set of hybridization probes which comprise microbial coding sequences the transcription or cotranscription of which is induced during microbial infection of a host. These probes may be used to screen DNA libraries such as cosmid, lambda, or plasmid libraries, thereby identifying and isolating genes that are transcribed or cotranscribed in connection with the coding sequences making up the hybridization probes of the present invention. The probes of the present invention may also be sequenced and the sequence compared to published sequences, thus (i) identifying genes that are already known but are only now shown to be involved in virulence; or (ii) identifying genes that are unknown.

The method and probes of the present invention are based on the principles of a technology termed in vivo expression technology (IVET), disclosed in U.S. Pat. No. 5,434,065, and herein incorporated by reference. As alluded to previously, the IVET methodology suffers from a number of technical shortcomings which limit its selectivity as discussed below. The modifications also discussed below address these shortcomings and provide a number of coding sequences which are induced in vivo, and can be used as probes to identify other in vivo induced genes.

First, preliminary genetic and sequence analysis of in vivo induced (ivi) fusion join points revealed that some of the cloned fragments are comprised of small (e.g., 50 bp-100 bp), multiple inserts that have ligated at least two unrelated pieces of DNA together, making determination of the actual in vivo induced genes problematic. Second, the parental purA deletion, which is the basis of the IVET selection, was isolated as a Tn10-generated event, thus leaving a transposition competent IS10 element at the join point of the deletion, which extends from purA into an undetermined amount of adjacent chromosomal material, see Maloy S. R., et al., J. Bacteriol., 145(2):1110-1112 (1981). This deletion-containing strain has a slight growth defect even in the presence of exogenous adenine, suggesting that the adjacent chromosomal material that was removed contributes to the slow growth phenotype. Also, the transposition competent IS10 element at the deletion join point contains an active promoter that reads outward into adjacent chromosomal material, see Ciampi, M. S., et al., Proc. National Acad. Sci., 50:16-20 (1982). The transposition of this mobile promoter could unnecessarily complicate the IVET selection process. Finally, streptomycin resistance (SM^(r)) was used both as a counterselectable marker upon mating the initial pool of recombinant plasmids from E. coli into S. typhimurium and as a selection against normal flora present in host tissues (e.g., normal flora in the small intestine). The SM^(r) mutation renders the parental strain somewhat attenuated in vivo. The parental SM^(r) mutant used in all of the IVET selections to date is slightly attenuated when delivered intraperitoneally and even more so when delivered orally. Such parental attenuation can affect the classes of genes that answer the selection, particularly after an oral delivery of integrated fusion strains.

Taken together, the shortcomings uncovered with the current IVET methodology warrant consideration. Consequently, the method disclosed in U.S. Pat. No. 5,434,065 was modified as discussed below to produce in vivo induced fusions that circumvent the concerns addressed above. The first modification discussed below was implemented for the construction of all pIVET vectors, that is pIVET1, pIVET2 and pIVET8, while the second modification was only applicable to pIVET1 and pIVET8.

CONSTRUCTION OF pIVET1, pIVET2 AND pIVET8 VECTORS

The pIVET1, pIVET2, and pIVET8 vectors were constructed as described in U.S. Pat. No. 5,434,065, incorporated herein by reference, using the following modifications.

First, for each vector the random fragments of chromosomal DNA were size fractionated. Random fragments of S. typhimurium DNA, obtained by partial Sau3A I restriction digestion, were size fractionated and removed from agarose gel after electrophoresis. The cloning of large chromosomal fragments increases the probability that in vivo induced promoter regions will be contained in the initial pool of recombinant clones that will be integrated into the bacterial chromosome. This modification further decreases the probability of multiple inserts since the ends available for ligation will be limited to large fragments (1 to 4 kb).
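The effect of size fractionation can be sketched in silico. Sau3A I recognizes GATC and cuts immediately 5′ of the site; in a partial digest only some sites are cut, so any pair of sites can bound a fragment. The function names and the simplification of ignoring fragments that end at the physical ends of the molecule are assumptions made for illustration, not part of the described protocol.

```python
import re


def find_sau3ai_sites(seq):
    """Return 0-based positions of every GATC site (Sau3A I cuts 5' of GATC)."""
    return [m.start() for m in re.finditer("GATC", seq.upper())]


def size_selected_fragments(seq, min_len=1000, max_len=4000):
    """List (start, end) coordinates of fragments bounded by any two Sau3A I
    sites whose length falls inside the size window, mimicking a partial
    digest followed by gel fractionation of 1 to 4 kb fragments."""
    sites = find_sau3ai_sites(seq)
    fragments = []
    for i, start in enumerate(sites):
        for end in sites[i + 1:]:
            length = end - start
            if length > max_len:
                break  # later sites only give longer fragments
            if length >= min_len:
                fragments.append((start, end))
    return fragments


# Small demonstration with a shortened size window (sites 10 bp apart).
demo = "GATC" + "A" * 6 + "GATC" + "A" * 6 + "GATC"
print(size_selected_fragments(demo, min_len=10, max_len=20))
```

Every fragment returned begins with GATC, which is consistent with the probe sequences listed in Table 3 each starting with GATC.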

The second modification was only necessary in the pIVET1 and pIVET8 selections. One way in which a purA mutation may be obtained is by constructing a purA deletion in vitro that is associated with an antibiotic resistance marker. To perform the IVET selection in as native a parental background as possible, a purA deletion can be constructed in vitro. The wild-type S. typhimurium purA gene can be cloned by complementation of a purA deletion (on minimal medium) with a pool of recombinant clones representing the S. typhimurium chromosome. Once the wild-type purA gene is isolated, a purA mutation is constructed in vitro, by introduction of a DNA fragment encoding an antibiotic resistance marker (e.g., tetracycline) into the purA coding sequence. The tetracycline resistant mutation is then crossed into a chromosomal purA gene by introduction of the cloned insertion-bearing plasmid into wild-type S. typhimurium. The phenotype of the desired purA′::Tc^(r)::′purA recombinant is PurA⁻ Tc^(r). Additionally, the Tc^(r) insertions in purA, thyA, or near purA⁺, in the pIVET1, pIVET2, or pIVET8 selections, respectively, alleviate the need for the attenuating Sm^(r) mutation as a counterselectable marker. In the alternative, insertions of a transposition defective transposon, e.g. Tn10d-Tc, in purA or thyA can be used as described here.

The implementation of these two changes to the current IVET selection protocol resulted in the construction of random individual pools of pIVET1, pIVET2 and pIVET8 fusions having 1 to 4 kb fragments of S. typhimurium DNA that contain very few multiple inserts. Each pool was then integrated into an otherwise wild-type S. typhimurium strain that contains a purA mutation or a thyA mutation in the case of pIVET1 and pIVET2, respectively, or a drug resistant mutation near the purA gene (e.g., Tet^(r)) in the case of pIVET8. Theoretically, using this revised protocol, there are no a priori limitations either to the mode of delivery of these integrated fusion pools (oral, intraperitoneal, intramuscular, etc.) or to the type of tissue from which the mouse-selected fusions are recovered.

A total of 100 BALB/c mice (Charles River Laboratories) were infected either orally or intraperitoneally with approximately 5×10⁸ cells or 10⁵ cells, respectively, using pools of purA-lac fusion strains (i.e., pIVET1), thyA-lac fusion strains (i.e., pIVET2), or cat-lac fusion strains (i.e., pIVET8). Three days after infection, the mice were sacrificed and their internal organs removed and homogenized in 2 ml of sterile saline. The homogenate was grown overnight in LB containing ampicillin and 10⁵ cells were injected into a second set of mice, where the process was repeated. In addition to infecting mice, the cat-lac fusion strains were used to infect RAW 264.7 tissue culture macrophages for two or three hours. The bacterial cells recovered from the organs and macrophages were plated out on MacConkey Lactose indicator medium and approximately 2,894 white colonies were picked for further identification; the data are presented in Table 2.

TABLE 2

Selection   Route of Administration   Tissue       Total Colonies Screened   White Colonies
purA-lac    Intraperitoneally         Spleen       60,000                    386
                                      Liver         8,000                     34
                                      Intestine    N/A                       N/A
            Oral                      Spleen       16,000                     97
                                      Liver         8,000                     26
                                      Intestine    60,000                    494
thyA-lac    Intraperitoneally         Spleen       16,000                     34
                                      Liver         8,000                     14
                                      Intestine    N/A                       N/A
            Oral                      Spleen        8,000                     32
                                      Liver         8,000                     48
                                      Intestine    16,000                    119
cat-lac     Intraperitoneally         Spleen       30,000                    764
            Tissue Culture            Macrophage   30,000                    846
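The white-colony counts in Table 2 can be tallied per selection to check them against the clone numbers quoted in the text (2,894 in total). The following sketch transcribes the table into a list of tuples; the variable and function names are illustrative, and N/A entries are represented as `None`.

```python
# Rows transcribed from Table 2:
# (selection, route, tissue, colonies screened, white colonies)
TABLE_2 = [
    ("purA-lac", "intraperitoneal", "spleen", 60000, 386),
    ("purA-lac", "intraperitoneal", "liver", 8000, 34),
    ("purA-lac", "intraperitoneal", "intestine", None, None),
    ("purA-lac", "oral", "spleen", 16000, 97),
    ("purA-lac", "oral", "liver", 8000, 26),
    ("purA-lac", "oral", "intestine", 60000, 494),
    ("thyA-lac", "intraperitoneal", "spleen", 16000, 34),
    ("thyA-lac", "intraperitoneal", "liver", 8000, 14),
    ("thyA-lac", "intraperitoneal", "intestine", None, None),
    ("thyA-lac", "oral", "spleen", 8000, 32),
    ("thyA-lac", "oral", "liver", 8000, 48),
    ("thyA-lac", "oral", "intestine", 16000, 119),
    ("cat-lac", "intraperitoneal", "spleen", 30000, 764),
    ("cat-lac", "tissue culture", "macrophage", 30000, 846),
]


def white_colonies_by_selection(rows):
    """Sum the white (Lac-) colonies for each selection, skipping N/A rows."""
    totals = {}
    for selection, _route, _tissue, _screened, white in rows:
        if white is not None:
            totals[selection] = totals.get(selection, 0) + white
    return totals


totals = white_colonies_by_selection(TABLE_2)
print(totals, sum(totals.values()))
# {'purA-lac': 1037, 'thyA-lac': 247, 'cat-lac': 1610} 2894
```

The per-selection sums agree with the 1,037 purA-lac, 247 thyA-lac, and 1,610 cat-lac clones that were subsequently digested.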

Identifying In Vivo Induced Genes

In order to identify the in vivo induced genes, a genetic approach to clone the 2,894 selected in vivo induced fusions directly from the bacterial chromosome using phage P22 transduction was implemented, see Mahan M. J., et al., J. of Bacteriol., 175(21):7086-7091 (1993), incorporated herein by reference. Briefly, a bacteriophage P22 lysate is made on the fusion strain of interest and used to transduce a recipient strain such as MT189, that contains the replication protein, Pi, which is required for autonomous replication of the pIVET1, 2, and 8 vectors. After introduction of the linear chromosomal fragments containing the integrated fusion construct into a Pi containing strain, the transduced fragment circularizes by homologous recombination at the region of duplication defined by the cloned S. typhimurium DNA. The circularized fragment can then replicate as a plasmid in the presence of the Pi replication protein, resulting in the cloned fusion of interest. In other organisms where cloning by transduction is not possible, the fusions can be cloned by more standard methods (S. Berger, et al., Guide to Molecular Cloning Techniques, Academic Press, Inc. (1987)).

Plasmids from the recipient strain are isolated and used to transform E. coli cells following standard calcium chloride or electroporation procedures, see T. Maniatis, Molecular Cloning: a Laboratory Manual, Cold Spring Harbor, N.Y., (1989). DNA mini preps are performed followed by restriction digests. 1,037 clones containing the purA-lac fusions were digested using BamHI and EcoRI; 247 clones containing the thyA-lac fusions were digested using BamHI and EcoRI; and 1,610 clones containing the cat-lac fusions were digested using BamHI and Sal I. Restriction enzymes BamHI, EcoRI and Sal I were obtained from New England Biolabs, and the digests followed the manufacturer's instructions. The DNA fragments resulting from the digests were separated on agarose gels and compared to one another for redundancy. 250 individual clones from the 2,894 clones digested were identified as having different digest patterns. Using primers homologous to the 5′ end of the purA, thyA or cat gene, approximately 70-400 base pairs of S. typhimurium DNA were sequenced immediately upstream or 5′ to the purA, thyA or cat gene in each of the respective cloned fusions.
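The redundancy comparison of digest patterns can be pictured as grouping clones whose fragment sizes coincide within gel resolution. The sketch below is only an illustration of that idea: the clone names, fragment sizes, and the 100 bp binning are hypothetical, and fixed-width binning is a simplification (two bands near a bin boundary may fall into different bins).

```python
from collections import defaultdict


def pattern_key(fragment_sizes_bp, bin_bp=100):
    """Round each fragment size to the nearest bin and sort, so that clones
    with matching banding patterns receive the same key."""
    return tuple(sorted((size + bin_bp // 2) // bin_bp * bin_bp
                        for size in fragment_sizes_bp))


def group_by_pattern(digests, bin_bp=100):
    """Map each distinct digest pattern to the clones that display it."""
    groups = defaultdict(list)
    for clone, sizes in digests.items():
        groups[pattern_key(sizes, bin_bp)].append(clone)
    return dict(groups)


# Hypothetical fragment sizes (bp) read from a gel for three clones.
digests = {
    "clone_a": [3020, 1480, 610],
    "clone_b": [1510, 2990, 590],
    "clone_c": [4200, 900],
}
groups = group_by_pattern(digests)
print(len(groups), groups)
```

Keeping one representative per group reduces a redundant collection to its distinct patterns, which is the kind of reduction reported above from 2,894 clones to 250.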

SEQUENCE ANALYSIS

The purA, thyA and cat primers used for sequencing were 5′-CATTGGGTGCCCAGTACG-3′ (SEQ ID NO.: 1), 5′-TGTGCCTTCGTCGAGCAC-3′ (SEQ ID NO.: 2), and 5′-CAACGGTGGTATATCCAG-3′ (SEQ ID NO.: 3), respectively. Primers were purchased from Operon Technologies (Alameda, Calif.).
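Basic properties of the three sequencing primers can be computed directly from their sequences. The sketch below reports length and G+C count, and adds an approximate melting temperature using the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C) °C; that estimate is an assumption introduced here for illustration and is not stated in the specification.

```python
PRIMERS = {
    "purA": "CATTGGGTGCCCAGTACG",  # SEQ ID NO.: 1
    "thyA": "TGTGCCTTCGTCGAGCAC",  # SEQ ID NO.: 2
    "cat": "CAACGGTGGTATATCCAG",   # SEQ ID NO.: 3
}


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq.upper())


def wallace_tm(seq):
    """Rough melting temperature in degrees C by the Wallace rule
    (an illustrative estimate for short oligonucleotides)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(name, len(seq), gc_count(seq), wallace_tm(seq))
```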

All DNA sequence analysis was performed by the dideoxy nucleotide chain termination method of Sanger et al. (1977) with double stranded plasmid DNA as the template using a Sequenase kit (United States Biochemical Corp., Cleveland, Ohio) as per the manufacturer's instructions. Primer annealing was as follows: 10 μg of double or single stranded DNA was denatured in 80 μl of 0.2M NaOH at room temperature for 5 minutes. Three pmol of primer and 8 μl of 3M sodium acetate were then added. 200 μl of 100% ethanol was then added and the mixture placed on dry ice. After 20 minutes the mixture was centrifuged in an Eppendorf 5415C microcentrifuge (Brinkmann Instruments, Westbury, N.Y.) for 10 minutes, the ethanol was removed, the pellet carefully washed twice with 200 μl of 70% ice-cold ethanol and taken to dryness in a Savant Speed Vac Concentrator (Savant Instruments, Farmingdale, N.Y.). 2 μl of 10× stock sequencing buffer and 8 μl of water were then added to the dried pellet and the labelling reaction performed.

20 cm or 33 cm×60 cm 6% acrylamide-7M urea sequencing gels (CBS Scientific Inc., Del Mar, Calif.) were used to obtain sequences starting typically from 20 to 30 bases from the priming site out to about 300 bases in a single loading. Similar results were also obtained using wedge gradient gels with a spacer to wedge ratio of 1:4 in a single loading. Priming was with ³⁵S dATP (1000 Ci/mmole, DuPont NEN, Boston, Mass.). Gels were removed from the glass plates with 3 mm Whatman filter paper (Whatman Ltd., Maidstone, England) and dried; a readable sequence could often be obtained after an 18-24 hour exposure using Kodak Biomax MR film.

Analysis of nucleotide sequences from one strand reading from the 3′ direction to the 5′ direction was performed using a Power Mac 7100/66 computer and the Wisconsin Sequence Analysis Package Version 8, a program available from Genetics Computer Group, Madison, Wis. About 50% of the fusions are in genes that show no significant homology to sequences in GenBank version 72. As only one strand was sequenced, the sequence results (SEQ ID NOS: 4-254) represented below in Table 3 have an accuracy of approximately 95%.
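To give a sense of what approximately 95% accuracy means for an individual probe, the expected number of miscalled bases scales linearly with read length. The short sketch below applies that arithmetic to the first three lengths listed in Table 3; the function name is illustrative, and the assumption that errors are spread uniformly along the read is a simplification.

```python
def expected_miscalls(read_length_bp, accuracy_percent=95):
    """Expected number of incorrect base calls in a read of the given length,
    assuming a uniform per-base accuracy."""
    return read_length_bp * (100 - accuracy_percent) / 100


# Lengths of SEQ ID NOS: 4, 5 and 6 as listed in Table 3.
for seq_id, length in [(4, 390), (5, 238), (6, 309)]:
    print(seq_id, length, expected_miscalls(length))
```

For a 390-base read this works out to roughly 20 uncertain positions, which matters when comparing a probe against database sequences or designing hybridization conditions.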

TABLE 3 SEQ ID NO LENGTH PARTIAL 3′-5′ SEQUENCES OF PROBES OF THEPRESENT INVENTION 4 390 GATCCGGATG GAATGGCTCC AGCGCGTCGG TTTTCTCGCCGACACCGAGG AATTTAATCG GCTTGCCGGT GATATGACGA ATAGAGAGCG CCGCACCGCCACGCGCATCA CCATCAACTT TGGTCAGCAC CACGCCGGTT AACGGCAGCG CTTCGTTAAAGGCTTTTGCG GTATTCGCCG CATCCTGACC GGTCATCGCA TCGACGACAA ACAGCGTTTCTACTGGCTTG ATAGAAGCGT GGACCTGTTT GATTTCGTCC ATCATCGCTT CGTCAACATGCAGACGACCG GCGGTATCCA CCAGCAGCAC GTCGTAGAAT TTGAGCTGCT TCTTGGCGGTTGACAGTATC ACGTTCTGCG AAATCAGACG GAGAATCACG CAATTGTACA 5 238 GATCATAGAGGTGGATACGG CTTTTCAACG CCTGTTGGAC GGCGTGCCAG TCGGCCTGTT CAAAACGCTGCTGCGCGCCG GAAGTCACTT CCAGAAATCG ACCATACTGC GCGTCAAAGC CTTGCAGGATGGTTTGAGCA ATCAGTAATT CCAGGCCACG CGGCATTTTT TTACCTCATC CGGCACCACGTCATGCCGGA TGCGCGTTCG CTTATCCGGC CTACGCTATC TGTAGGCC 6 309 GATCGAGAGGATGCGGTGGT GGATGCGCAT ATTACCGGAT GACGGCGTGA ACGTGTTATG CGGCCTACCAGCCCAATGCG CGATACCAAG CCGGATAAGC CGCCAACGCC CACCCCGGCC CCGCCGCGTATTTAATCAAG TTATTACCTT TGATCGCACC CTTGAGGTCA GGCGCGTGAT AAGTTCGTAAGCACTTACTT TTGTCATTTC AGCGATACGT TCAACCGGCA GACTTACCCA TAGACACGATCGCGGTATCT CGGTTGCCAA TTCGAATCTA TCCATGGACG CGACATCGAC TACGACATT 7 362GATCCGTTTT GACCATCCCG TGTTTGGTCG AAACCGTGCA GCCTTCTACC AGCGGCAGTAAGTCGGGCTG TACGCCTTCG CGTGAAACAT CCGGGCCGGA GGTTTCGGCA CGCTTACAAATTTTGTCAAT TTCATCGATA AACACGATGC CGTGCTGTTC AACCGCGTCG ATAGGTCCTGTTTCAGCTCT TCCGGGTTGA CCAGTTTAGC AGCCTCTTCT TCAACCAACA GTTTCATCGCGTCTTTAATT TTCAGCTTAC GGGTTTCTGT TTCTGACCGC CCAGGTTCTG GAACATAGACTGCACTGCTG TCATCTCTCA TGCCGAGCCA TATCTCTAGC CATCGGCGCA GTATTGACTT TA 8206 GATCAAGAAT GTGTTCTCCC AGCGCATCCT TGATGGTTTC TCCCAGCACC TTGCCGAGCATACTGACATT ACTAGCAACG CGGAATATTG TTCGTTCATA TGCCCCCAGA CGCCCCATCTTTAATGTAAT TGCCCTGTCT CTTTCATGCC ACAGCGCAGT GGCTGCGTGC GTATGCAGTTATGCGAATGC TCGTGCTGCG ACTAAT 9 250 GATCGTCGGT GCGAATGGTG ACGTCGGCAATCTCTTCGTA CAGCGGATTG CGTTCGTTAG CCAGCGCTTC CAGAACTTCG CGAGGCGGTGCTTCAACCTG CAACAGCGGG CGTTTTTTAT CACGCTGCGT GCGGCAGTTG TTTTTCGATCGGTCGTTTCA AGGTAGACCA CGACGCACGG CGAGAGACGG TTACGGTTTC ACAATTTTACAGAGCCACAT CGGAACACAC 
ATACCTTTAT ATCTATACTT 10 176 GATCCAGGCT TCGCGTTCTGATAGCTGTCA TACGGTACGG TGGTGATTTC CGGATGCTTA TCCATGATGA ATTTCTGGTGTCGTCGTACC GTTCTGTACG CCGACTTTCT TGCCTTTCAG TTGATCAACG CTGGTGTATTGCCTGCTGAC CACGAACAGC GTGAGTAGGG TATATG 11 312 GATCTTCCGC CCAGCCTGCGACTTCTACTT TCGAGGCCTG GATTTCGAAA CTTTGCCCCT GTGCCGGCGA CGCGACAACCTTACCTGTTA CTACCACGGA GCAGCCTGTC GACAGGTGTA ATACTTCTTC ATTATAATTGGGCAGAGAAT TATTAATGAC AGCCTGTACA GGATCAAAGC AGGAGCCGTC ATAAACGGCGAGGAAGGAGA TGTCCAGCTT TTGAATCTCG GTCGGGTACG ACCCATCCCG CGCAGTGACTTCTTGGTCAA CGGCTACTGG CCTGGAGTAC TGCGGCTACG GCACACGTCA TA 12 289GATCCCAGAT AATCGCCAGG ATCACCATCA CCACCGTTGG CATCAACCAA GCCAGTCCCTGTTCCGCCAG CGCAAACGCT GACTCCAGGC TGGCAGCATA TCGCCGAAGG ATGCTTTGATGCCGTCAAGG ATACCAAAAA GCAGACTGAT AAACATGGCC GGCGCCGATG ATACGGGTGGAATTATGCCA CCATGAGCGG GTAAAACTTA ATACAACCAG TGCGATACAC GGCGGATAGATAGCGTCATG ACGGAATTGG AGATTATCAG ATCGCTCAGT CGAGGTTGA 13 240 GATCAATAATGTTATCCCGG CTTAACACTT CATCCGGGTG ATGCGCAAAA TACATCAGAA GATCGATCAGCCGTGGTTCA AGAGTAATCT GGCGTCCCTG ACGACTGATC TGACCAACAG AAGGTATAACCAGCCACTCT CCAATGCGTA CAACAGGTTG CTGCATAAAA AGATGCCTAA CGAGCTAAGTCATACGTATA TACACGATTG CACAGACTTT TATCCTTTGT AAGAAGCTAA 14 260 GATCAGAACCTTAAAACAGC GTAGACACTT TTTTGGCTTT GTGAGAAATC CACGGACAAT TCCGCGAGCCAGTTATCGAC GTAGAACAGA GGAAGGGAGG AGCCCTTGCC GAAAAGGCCA TCCCATGGTGAATCGGGAAC GCTCCGGTTC CCGTTAATGC CTAATAATTA TCGTAATATA AACAACCGGAAATCAGTATA GGCCGCAATT TTGACGATTC ACCGAAATTG TTAGCGTGCT AATTACAGAGTACAGTTAGT 15 314 GATCGGCATA CAGCGCGTAC ACTTCATCCA GACGTTTGAG GGCGTTAACCACTTCCGAAA CGGCCTCTTC AATCGACTCG CGTACCGTGT GTTCCGGGTT TAGCTGAGGTTCCTGCGGCA GGTAGCCAAT CTTAATGCCG GGCTGCGGGC GCGCTTCGCC CTCGATATCTTTATCGAGCC CCGCCATGAT GCGCAGCAGG GTAGACTTAC CGGCGCCGTT AAGGCCCAGCACATCCGATT TGGGCCCAGG AGAGCTCAGG CAGATGTTTC AGATATGACG TTCAGACACTGCGAACCGAT GCTGATAGAT GAGC 16 350 GATCGCCATT CTGCTAACGA CTCTGACGCTGGCGCTGCTC TCCAGGCTGC ATCGGTTATA ACATTCTGGC GACACGGGCA AAACGCGGCTGTCGCCAGTC TCTGTCAGAA ACGGTAATCC ACCGCCATAA AGTAACGACG TCCGTCTTCGGTATAACCGT AGTCGTCGCG TTTGAGATCT 
TTATCGCCCA CGTTCAGAAC GCCCGCACGCAGTTTAACGT TTTTCGTCGC CTGCCATGCC GCGCCGGTAT CCCAGACCAC GTACCCGCCCGGCGTTTTTC GCTGTTTGCC TCTGTCGGCC CGCTTACGCC GGTATAATTC CTGATACGTAGATGACAGTT GAGCTGACCG 17 336 GATCGTGCAA ATGCGCGCTA AAGGTGGCGG CGTCCATAAAGCCGGTGACT CGCGATTGCG GCTGTTCCTG GCCTTGGGTA TTAAAGAACA GAATGGTGGGCAGCCCGAGG ACTTGCAGAT GCTTTAACAG CGCGACATCC TGCGCATTGT TAGCGGTGACGTTAGCCTGC AAGAGCACCG TGTCGCCGAG CGCCTGCTGG ACCCGCGGAT CGCTGAAGGTATACTTTTCA AACTCTTTTA CAGGCCACGC ACCAGTCGGC GTAGAAATCA GCATAACGGTTTGCCTTTGG CCTGCGCCTG ATTGAGTTCA TCCACGTAGA ATAGCCGTGA ATTGAG 18 286GATCCGCGAG GTGCGCCAGT TGCACCATCT CCAGCAATTG CGTCACTTTG TTTTAATCGCCGCCGCCGCA GTTGGGCGTC GCTCGCGCAG ACCGTAGCCA AAAGCGATGT TGTCAAACACCGTCATATGG CGAAACAGCG CATAGTGCTG AAAACACAAA ACCGACTTTA CCTACTGGTGAGGCGCTAAC GTCGTACGTG GAAACGATAT ACCGTGGACT GTGTCAGCCC GGCAATAATCCCGGCTGTTT GCGGAACTAC GCACAGGACA TTGCGAGATA TTACGG 19 325 GATCGCGAAAGGCGTACATC TCACGGAATT TCCAACCGGT ATCAACGTGC AATAGCGGGA ACGGCAACGTACCCGGATAA AACGCCTTAC GCGCCAGATG CAGCATGACG CTGGAGTCTT TACCAATGGAGTACAGCATG ACCGGATTAG CGAATTCCGC TGCCACTTCA CGATAATGTG ATACTTCGCACAGTTGCGCA GTGGTGAGTC GTTTTGATCA TACGTCTTTG CATCGTTTTG CTAACTGATACGACTAGGCG GTATATCGAT GATGTGTCTA GATACGCACA TCACACCGAT CCTGCAATTCACGTACACGA TCTGC 20 200 GATCAGGTGC GGTCGGTAAT TGACAAAATA TGGGCAAATGGCCACGACAT TACCCCTTAA TTGATTGGCA GCAGCTCGTG GCTGATTGAT TTTAGCCGGAGCCGGACGCT CCGATTTTGG CGTCAGATAC CAATAACCCA ATCCATGAAT ACACACGACAAGTATACGGG TTACACACAG TATACATCGC AGATCGCTGT 21 264 GATCGGTTTT ACCCTTCGTCCCTTTGATAT AACGCGTGAC GCCGTTAACG TACCGCCAGT GCCGACGCCG AGATAAACACATCCACCTGA CCATCGGTCT CCAGAGTTTC CGGGCCGGTG GTTTTTCATG GATTCTCGGGTTGGCAGGGT TGCTGAACTG CTGGAGCAGG AGATATTTTT GCGGATCCGT GGCGACAATTTCTTCGGCTT TCTTGAATAG CGCCTTCATC CTGGCCTTGT CAGCACCAGA TTGGCTATGC TTAG22 324 GATCAGAATC TATGTTGTCA CAGATTAATA GTTTATTATA TATTTCATCA AAATAATCGACGTCAAGTTC TTTGTTTTTA TTTAGAGTGA ATACTTCCTG TCGTTTTTTA TCGTTTACATAATCGACTAC CGTAACTGCA ACATTCTTAT TTTTTTGTTT CTCTATACAT AGTAATATGGTGTCAAGTTC AAATTTTATT TCTTCAAATC GCAAATCAAA 
GAAAAAATCT ATATTTTTATTTAAAATCGT TGTCAATTAT CTTTAAAACG ATGTTTTACG TAACATTGTC GTATATATCGTCTGAGTCTA ATCAATATCA TAGT 23 276 GATCTTCGCC TACCGGCACC AGATTGGTTTGGTACAACAG AATGTCTGCC GCCATCAGCA CCGGGTAATC AAACAGGCCG GCGTTAATGTTTTCCGCATA GNNNCAGATT TATCTTTAAA CTGCGTCATA CGGCTCAGCT CGCCGAAATAGGTATAGCAG TTCAGCGCCC AGCCAAGCTG CGCATGTTCC GGCACATGGG ACTGAACGAAAATAGTGCTC TTTTAGGATC ATACCACATG CCAGGTACAG NNAGATTCCA GGCGTTTACG TAGTGT24 329 GATCCGGCGC CGGAGCCACC ACGCCTTCAC GCGGGGCTCC GGGTTCGGCG CGGGCAGATTCATCAGCTTC GCCAGAATGC TCGCCAGCTT CAGGCGCATT TCCGGGCGGC GGACTATCATATCAATAGCG CCTTTTTCGA TCAGGAACTC ACTGCGCTGG AATCCTGGCG GCAGTTTTTCGCGAACGGTC TGTTCGATAA CGCGCGGGCC GGCGAAGAAT CGAGACTTTT GGCTCGGCGATGTTGAGATC GCCAGCATCG CAAAACTGGC GGAAAAGGCC CATTGTCGAT CGTACTACGAAATGTAGGGC AGACGCTCTG CATTTAGAC 25 222 GATCCCTAAC ACCCGGTCAG TTCCCGACAGGCCGGTCTTT TCTACTAGCT GACCTATCAC AAAATTCACG ACAGCGCCGA TCGATAAGCGTCGCGATAAA CAGTACCGCG ATACGAATTC CCATTACGAA CCAGTTCGTC TTCAAAGCCCGTAAACCAGA CAGACAGGTA AGTGTAGTAG TGACTGGCGA CAAAGAAGCA CACCCACGTACCAGCATACG TC 26 166 GATCAGTATA CAACTATCAG TAATTCGACG ATAGACCGAAGTGTGCTTGC TGGCGCTTTA TCGTCAAGGA TAATTGCCGC TTTGACGGCC TTCGCGCTTCCTGCCAACTG GCTTCGTCTT TGTGCATGAA TCACCGCCAG CGGCTCTGCC GCTCGATNTG TCGATC27 333 GATCGCTTAA CAGATAATGA CTGGCGCTGC GGGGCTCCAG TACGATATAG CCGCCTAGCAACACGACAGG CGCGCTTTTA TGGTTCAGGT CGCGACGAAT GGTCATTTCA GAGACGCCCAACAGGGTCGC GGCTTCTTTA AGATGAAGTT TATCGCTGCG TTTTAAGGCC TGCAGCAATTGACCAATAGC GTCGTCGCTC GGCTTTCCAT AGTTCCCCTG GAGAGTTAAA TAAGCGCTCCGCACCATACA GAGCGCTTAA TATTACTCTT TTTTGCGCTA TTTAGTCACG TACCCAGCCTTTTCGAATGG GCAATGCAAC AGAACGTACA CGT 28 221 GATCGCGCTC AATCGCTTCCGCCGCCAGTT TAGCCGCCAG CTCCGGCGTT TTTTCATGCA CCAGAGCTTT CTTAAGCGCTTTTGGCGTAG CACCACTTCT TTGGTTTGTA CTACCGGCGT GGTGGCCTTC CAGCGATAAGCCTCTTTCTT TACTGGCGGT TTCCAGCGGG ACGGNGGGNT GTACNNTCCG AAACCGAGGAGCGTCAGNAG AGTTATTACG G 29 368 GATCGTCGTA CCGCCAACCG AGCCGCCGGGTATGTGTCGT TAAACTCTGT CGCCAGACCA TAGTTAGAGG TAATAGAAGC CCCCCAGCCAAACTGGTCGT TAATCGGGGC GACAAAATGG ACGTTCGGCA CCCAGGCCGT 
CAGCGCGATGTTATCCGCAT CTAACGTCCG ACGAGATGGC GATGTCCCGC TAATATTAAC ATCAGGATCAATATAAACGC GCCCGCTGAA AACGTCGGGC GGTCAAACAT GTATTACGCG GGTGCGCTACGTACGCATCA TCTGCGATGC GCTCACGATA GCGCAGCAGA GAGAATCGTA CTGAGCTCGCGACAGTGTGA TGTCGATCGG ATCGCGCTTT GCAGTTTG 30 288 GATCTCCACA AACTGTTCCGGCTGAGCGAT AGCTTAAGTA GCGCATGTTT CCTCCAGGTA TGGAAATGCT CTGTGAGGCGGTAAGTCGAG CCCACGTACG GCCCCTGCTC CTTCTTACCC ATGCGCAGCA TCTTCTTCATACAGACGCGC CGCCGGGTTC GAGACCACAT TCGGGTGCAG CGGGTTAGTG CCCAGCGGCGTTTCATCGCT CGTAGTGTCA GGAACGCCTT CGCATTATCA TAGCAAACGA ACGTTCCAGCCCTTTCGCGT CATGAAAGAT GCGTCCGG 31 254 GATCAATAAC CGCATCGTTG TAGAAGTTCCCCTGCAATTT CANNNNATCC AGATAGTTGT TCTGGCTCAG GCCGACGGAA GAGAAGCCACGGATAATCAC GAAGTCATAG GTATTGGAAG CGCCGCGCTG CTTACCGTTA CACCCGCGTGTAACCCAACG CTTCTTTACT GACTGGAATT GATGCATCTG CATCTCTTCG TTAGTGACCACCGAAACCGA CTGTGCGTTT TTCGATAGTA TCAGTTTGTG TGCG 32 176 GATCTTGTTGGCTCGCCTCT CCCCTCGGAC AACACGGTAT AAAACGCGGT GATAGAGCCA CCGCCGTGGATGCCATTACC GGCACGCTCG ACCAGCGCCG GCAGCTTTGC GAACACCGAG GGCGGATAACCTTTGGTGGC TGGCGGTCGC GATTGCCAGC GCATTAGTGC ATTGAT 33 338 GATCGTGATATTCAATGCAC GCCTGCAGCG TGTTTTCGAT AAGCGTGGCG ACCGTCATCG GGCCGACGCCGCCCGGTACT GGCGTGATGT ATGACGCGCG CGCCCGGGCT TCGTCAAACA CGACGACGCCAACGACCTTG CCATTTTCCA GACGGTTAAT ACCGACATCA ATCACAATTG CGCCTTCTTTAATCCATTCG CCGGGAATAA AGCCCGGTTT ACCTACGGCG ACAATGAGCA AATCAGCATGCTCGACATGG TGACGCAGAT CTTTGGTAAA GCGTGCGTAA CGGTAGTCGT ACAGCCAGCCAGCAACAGTC ATGCTCATTG GGCTCAAC 34 319 GATCTTGCAG CGCGCCGTGC CAGGCATAGCGCACCTGCTC ATTAAAGACG TTCGTTTTAC GTGAGTTCGG TTTCGGCGTC GGCTTCTGGCGTGCTGGCGC GTTGCCGCCG CCTGTTCCGC GCGAGACTTA CGCAGTCGAT CCAGCCGTGCGCGAACTGCC TGATTTGGTT AATCGCGTGG GCCTATTCAT TGGCCAGGCC ACCATGCAGATGTCCATCGT CAGGACGAGC TGCCTATAGG AACGACGGGA CATAAGTCCA ATATGTGCGAGCGTCAGTAC CGTACCCTAA GTAAACTCTT CAACAGAAGT AAATGCCTT 35 418 GATCGATTTGCGCTGGCAGG TTGCTGCCGG TATTGACCTC TTTGTACATA TTCAGCGGCG CGTTCTGCGAGTAGCGCAGG TTATCTTCGA TATAGGTATT AAACACGCCT TTGGAGAGCG CGGCTTCATCACCGCCGCCC GTCCAGACGC GTTGGCCTTT TTTACCCATG ATAATCGCCG TGCCGGTATCCTGGCAGGTC 
GGCAGAATGC TTTGGGCGAT CTGCAGGTGG CACTTTTCGG GGAAATGTGCGCGGAACCCC TATTTGTTTA TTTTTCTAAA TACATTCAAA TATGTATACG CTCATGAGACAATAACCCTG ATAAATGCTT CAATAATATT GAAAAGGAAG AGTATGAGTA TCACATTCGGGCTATCTTTG GATTCTCGTT GACACAGAAC GAGGAAGAAG CGAGACAT 36 350 GATCAAGAGTCAGGGGTAAT TTTACCTTTT GCATAGGGCG CGCATATTAA CTTCGTAACG TCATATAGTCAAAGAAAAAG GCAGCCTGCG GTTGCCTTTT GCCAATAATT CGCACACATT GCGGGTTACAGACTTATTTT CGCTCAAGAC GAGTCAGTAT GACAGGCTTG AAGACCGAAG AGCTATGTTTAAGATGGCTC TCATCATTAC GCTATATCTG AGGGAAAAAA TATGCCCCGT CTCATCCTTGCGTCTACCTC TCCCTGGGCG TCGCGCGCTG CTGGAAAAGC TGACGATGCC TTCCGATGCGCGCGCGATGT GATGAACCCA TGCCGGGCAC GCGCTCAGTG 37 270 TGCGACAACA CACCCGCCAAAGCCGCCGCC GGTCATGCGC ACGGCGCCTC GATCGCCGAT GGTCGCTTTG ACGATGTCTACCAGCGTGTC TATCTGCGGG ACGGTAATTT CGAAATCATC GCGCATTGAG GCATGGGACTCCGCCATCAG TTGGCCCATA CTTCGAAATC ACCTTTCTCC AGCAGGCTTG CCGCTTCAACGGCGGGCATT TTCGGTCAAT ACATGGCGAA CCGTTTTCGG ATACCGGGAC AGTTCCGTGGCAACGGCATT 38 280 GATCCAGTGC TTTCGCCGCG TCATCCACAA TGACGTCAAA GCCAAAGGTTTCGGCGCGAG TACGCACGAC GTCCAGAGTT TGCGGATGGA CATCAGAGGC GACAAAGAACCGGTTGGCAT TTTTCAGTTT GCTGACGGCT TTGCCATCGC CATCGCTTCA GCGGCGGCGTCGCTTCATCC AGCAGCGAGG CGAACGATGT CCAGCCCTGT AGTACAGCGT ACTGTTGAGTTACAGACTCA AACTAAATCG TATAGATTTA GCCTACACTG ATTTACATTA 39 275 GATCATCGCCTTCAAATTGA CCTGCTTGAG ATCGAAAATG AGCTGCGCTA AGTCCTCGAT AGAGTAGATAGCGTGGTGCG GTGGCGGGGA GATCAGCGTC ACGCCCGGCA CTGAATACGC GAGTTTAGCGATATACGGAG TGACTTTATC CCCCGGCAAC TGACCGCCTT CGCCGTTCGC CTCACTTTAATCTGAATCAC ATCGGCATGA CAGTAGGTCG GTCACAAGCG CGACGACTCT ATCGCAATATGTCAATCCGG TCCTACATAT CATTT 40 333 GATCTTTCGA CTCGATGTTG GCGACGAAGATAAAGTTCGG CAGCAGCTTG CCCGCGTTGT CATAAACCGG GAAATACTTC TGGTCGCCCTTCATGGTGTA CACCAGCGCT TCGGCAGGCA CGGCGAGGAA TTTCTCTTCG AATTTCGCCGTCAATACCAC CGGCCATTCC ACCAGCGAAG CTACTTCTTC CAGCAGGCTT TCGCTCAGGTCGGCATTACC GCCAATATTA CGTGCTGCTC TCAGCGTCCG TTTGATTTGG CTTAGGCTCGTAGTCGCATG ACTTACGGAC TCAGAGAATT GCGGTACTGT CAGATGTGAG GACCGTACAT AAG 41233 GATCGGGCAT CGGCACGACA CCGGTATTCG GTTCGATAGT GCAGAACGGA AAGTTTGCCGCTTCAATACC GGCTTTTGTC 
AGCGCGTTGA ACAGGGTGGA TTTCCCGACG TTGGGCAGACCGACGATACC GCATTTGAAT CCCATGATTT AACTCACCTT AATATCTTAA TAATCAACCTGTTATAGAAA ACAGATTGCA GAATGGAATA CTCGCTATTA TCACGCGCGC AAA 42 302GATCAAGCGT GTCCGGCGAA AACGTTACGC GTTCTCGCAG CGATACAGGT GCCGTTTTATGGTTAATACC GAGCGCTAAA AGGGTCATGT CTGCGGGAGT AGTACCAGCG TTGATATGGTTAGTCTGCTT GCATCATACA GGATGCGCGT GGTCAATAAA AGAGAGAGCC CCCTTTTGGAGTAATTGGCA GCGCTCGCTA ATTTGATGAT TTAAGACACT TGAAAGTAGA CGATGTCACCAGGCGCCTAC ATTAAAGGCT ATACTGTACG ATAGCAAAAT TTCCGATCCG CCACTTTCAC TC 43262 GATCTACTTT CGGGATGGCA GCGTATCTGC CGCAATACAC CCTGATGGAT GTTATGCCTGGATCTGATTA CTCTTCTTTG GGCGAAGTTT TCGACCCGGC TCTTTAACTT CTGCCCGGGTCTGAAGGTCA CCACGCGCCG TGCTGTAATA GGAATATCTT CACCCGTTTT CGGTTACGCCCCGGACGTTG ATTTTTATCA CGCAGATCGA AGTTACCAAA ACCAGAGAGT TCACCTGCTCACGTTTCAGA GCACGACGAT CT 44 153 GATCAGGTCC ATATTTGTCT TTGCCTTTCTACCCGACACG TTTCGGGTGT GCGATTCGGA TTAGTCCGCC AGAAATAGCG GGCCCATTGGCGGTTTTGGA AGGTCAAAAA GGTCAGGGTA ATCCACCGCA ACCAAATATA GCCCTTCCGC CTT 45169 GGCGCGTTGG CAGATTTTGC CAGACGACGG GCGATTTCGG TTTTACCGAC GCCGGTCGGCCAATCATCAG AATATTTTTC GGCGTTACTT CGTGGCGCNN CTTCATCAAG CTGCATACACGCACGTTACN ATCNNGACGG AACCTTTGTA TCTGCGATAA TNNTTGTAG 46 282 GATCGCTGTAGATTTTACAA GTCTTCTTCA GCGATACACG TCTGCACAGC AGGCCGAAAC CGGTGTTGATGCCGTAGGAG TACGCCTTCA GGCAACGATA TCATTGACAA CGCGACGTGG CGTTAATACGTCAATGGCAT GGCCTTCCAG CGAAAGCTGT ACGATGAGAT ATGACATGAG AGAGACTTAACTGCCCCAGA GTATATATTG TGTTCATATC AGCCTTTCCT CAACAACCAT CGTAAATTCAGACTTACTCA CACACATTCA CGTAGATCAT TC 47 258 GATCGCGGGT CAGTGTACGCACCGCTTCCG GCGTATTTTT CCCGCTATTA AAATAGAGCT TGTCGCCAAC AATCAGGTTATCGAGATTAA TGACCAGCAG CGTATTTTTC TTCTCAGCGT CACTCATCGT TTGAGTAAATTTGGGGGCCT AGCTTTCCCT CTTCTTCCCC GCTGGTGGCG ATAAAACGAA TCCCGTAATGGGTCGGTATA TCTTTCAGAC GGCGCAGTTC CAGCATAAGC CCTAATCCCG CGGCATTA 48 315GATCGCGACA TGCGCAACAT CTACCAGTTT ACTTAACTGA CTAAACAGTA AGTCGACCGACCGGGGACTG GCAACGGTCA ATTCAATATT TATATTCTGC GCATCGGTCG CGGCTTCCATATTCAATGGA GCACACCTGA AAACCACGAT GGCGCACCAC GCGTAAAACA CGTTCTAAGGTTTCTGGATT ATAGCGTGCC GATACATTGA 
CCTGATGTTG CATCATGATA TTTCACGATTTCAGAGTCAT GGCGCAGGCG CACACGCAGA CATTTGAAGT CTCGATGAGA CGAGAGACGCCTCAGTCACT GTCGA 49 268 GATCCAACGT CTGGCGTAAT GCCAGCATGT CGTACTGGGTGTTGTTGCCC AGCTCCGCAC GTGGGTCGCC TTTCGCCACC ACGTTGAACG CCAGACCATCTTTAATTTGC GGCGTCGGCC AGCATGGTAA AGCGGTTGCT GAGTACACGC GCTTCACGGAATACCGTGGT GGCTTGAGCA CCGCTCACCT GCTTGAGTCG GCTGTTCAAC TCGGCGTAGTCCCCACATTA AGGCTGGTTG TACACGTCGT TGTTGGTGTA ACCGCGGT 50 296 GATCTAAAATTCAAATACAG GAACAGGGAG TTCTGGTGCA GAGGGTACTA TGTCGATACG GTGGGTAAGAACACGGCGAA GATGCAGGAC TACATAAAGC ACCAGCTTGA AGAGGATAAA ATGGGTGAGCAATTATCGAT CCCGTATCCG GGCAGCCCGT TTACGGCGTA AGTAACGAAG TTTGATCGAAATGTCAGATC GTATGCGCTG TTAGGCGGCT GGTAGAGAGC CTTATACCAT CTGAAAACTCCGTATCCGAG ATATTATAGA CTATTGGCAA CCTGAATCTC TCGATT 51 213 GTACACAGACGCCTTTCAGA TTGGCGATGA CGCATCCATT GAGAACACCC CATCGGTGGC GATCAGGACATGACGCGCGC CGGCCTCACG CGCCTCTTTC AGCCGCGCTT CCAGCTCTGC CATATCGTTGTTGGCATACG CTTCGCTTTA CACAAACGCA CGCGTCAATG ATAGACTGGT TCAGCGCGTCGGAATATAGC GTTCGCGCAG CAA 52 113 GATCGAAACT CGCCACGTTA ATCACCGTCGCCACCACCGG CGGCCAGCGT CCGTAAAGCA GCGCAATCAC CACTACGGCC CAGGCAAATCGATGCATTAC CAGATTGGCG GCG 53 337 GATCTTCCGG GTTAAATTGC AACAATGCTTCGCTAACGCG CAGCCAGCTC CATTTGCGGT TCCTCCATCA GCGAGGATTT CAGCGTATCCAGTAGCTTAC GAATCACTTC GGCGTTATCC GCTTCGTCCA AATCTTCATT AAACAACTCGGCGACCGGAC TAATATTGCC TTTTAACCAG ACTTCCAGAG TATGTTCATC AAGCGTTTTCACCGTTCGAA CGGTTAATCA GCCACATTTC CCCTTTCCAG CGATTCAATA CGCAAATCAACTGCGTTGGG AAGATAACCT AGGCACAACG GCAAATCAAG ACGTTGCATA CATATAAATAGCGCCAC 54 313 GATCATAAAA CTTCCGCGTG TATATGTTGG TTGGAACCGT AGAGATATAGACAGGTGGTT CTACACAGGC GTTTACCCCT ACCGTCGCAA ACATTTCTTT AATCAGGCTTTCTCTTTTTT CTTCTGATGG ATGCGAGTGA TTAAACTCAT ACATTAACGT TTTCCCACGAAGTCTTTTTT CCGGTAAGCC TTCGCATATA TCGGTAAATA GCTTGCCTGC TCTTATCTTTCGGTCATGGC ATGTTCATCG CGATCACTCC GTTATGATAT GTCTCGATAG CCTCGATCCAATGATGCTAC GCATCATCAC TCA 55 300 GATCGAATTC AGATTCCATT ATCGCCATCAGATATTCCAG ACGTTCAGAT TAACGTCGGA CATCTCCAGT ACGGACTGTT TATCCGCCAGTTTCAGCGGC ATATGCGCGG CGATGGTGTC AGCCAGACGT GCAGGGTCGT 
CAATGCTATTGAGTGACGTC AGCACTTCCG GCGGAATTTT TTTGTTCAGC TTGATGTAGC CTTCGAACTGGCTGATAGCG GTACGACCAG CACTTCTTGT TCACGCTCAT CAATGGCTGG CGAATAAGGTACTCGCTTCG CGAGAAATGT CGCGTGCAGA 56 423 GATCCCACTT CTTGAACTGC TCGAAGCAAACGCCTTCCGG CAGATCATCG CGCGCCACAT ACAGCTGAAT GCGGCCGCCT ACGTCTTGCAGGGTAACAAA AGAGGCTTTA CCCATAATAC GGCGCGTCAT CATACGGCCC GCGACGGACACTTCAATATT CAGCGCTTCC AGTTCTTCAG CTTCTTTCGC GTCAAACTCT GCGTGCAGTTGGTCTGAGGT ACGGTCAGAC GGAAATCGTT GGAACGGATA CCTGCTCACG CAGTCAGCCAGCTTTGCACG TGCCTTATTT ATTGTTAAGA TCGACTACTG TACGCCTGTC TTTGTCAGACATGTGATCTC ATAGCCTGGC TTTCAAACTT GCTCGATATG ATCAGACTAC GTCAGTACGCTGGATGCGTC ACAGTACAGC TTAATCGATC AGA 57 173 ACAGAATCTT TTTCACGACGTTCTCGTTAA TAACCGATAA GACGTGAGGA GTTTAGCAGA TTTAGTGCTT GATTTCGTGGCTTGTTTACA GTCAAAGAAG CCGGAGCAAA AGCCCCGGCA TCGGCAGGAA CNCTTATTTATTAATAAAAT CTTCCCCAAC TAATATCTTT TTT 58 218 GATCCTCCGT GGCATAAGAAATGCCGCCAA GAATCGTGAG TAAGATGTTG AAAGGATTGC GATAACATAC CCACAGATGCACCCACCACG GCGAGGGTTT CTGTGCCGGA ACGGTTTTCG CCATGCTTTT CACGCGCNNTCACCTCGGCA GCGTTTAATC CTCGGTGCGT ATCAAAACCT GCAGAGAGTC TCTGCTCATGCGCGACTTCA GACAGTAG 59 346 GATCGAGAAA AGTGAGCATC CCTTCGATGG TAAGTTCGGTCTCATCCTCC ACACTTAATG TCGGATTGTT CCCGGAACCA TCCAGCTTAC GTGTCGCTATCAGCAATACT CGGAATCCCT GCGCATTGTA ATCTTCGGTT TTCGCCAGCA GTAGCTCGCGGCGTGTTTCC GTCAAGCGCC ACCACACGAT CGCCTTCGCG AAGATGGGTG GCTACCATCATCATCTCTTC AACGGCGCTT TGCAGATCAG GCATCTGTCT CATGCTGCGC ATCTCACAGACGATACCGCG ACGTACAAGT CGATGCAGTC ATCGTTATGA GCCCTTGCGA TGTGCATGAC TGCAAC60 323 GATCCTGACG AATGGCCACA ACGGAAGGCT CATTCAATAC GATGCCTTGT CCTTTTACATAAATGAGGGT ATTCGCGGTA CCCAGGTCAA TGGACAGGTC ATTGGAAAAC ATGCCACGAAATTTTTTCGA ACATACTAAG GGATTAATTC CTTGAAAGCT GGGGCGAAAA CAAAATGCGTTTACTTTACC AACCACACGC AGCAGCGACA AGCGCGAAAA TCATCTGCTA CGTGAATTAGTGCGTCGTTC TTTGTACAAT CTCGCTGAGT CAGCTGAAAA TCACGCGATC TGCTCGTGACTTGAAGATCT CGATTCTCGA CAT 61 276 GATCGCGCGT GGTTTGCAGC GTCGGTTCCACCACCAGTTG GTTAATGCGG TTCGTTTCCA GACCACCAAT CTCTTTCATA AAATCTGGCGCTTTGATACC CGCCGCCCAC ACCATCCAGA TCGGCCTGAA TATATTCACC TTCTTTCGTATGCAGACCGC 
CTTCGGCGGC GCTGGTGACC ATAGTTTGCG TCAGCGCGAA CGCCAGTTTGGTCAGTTCAT TATGCGCGGC GTGGAGATAC GCGCGCACGA GGCAGATACG CGCAGTCACA CGAGTC62 166 GGGCCAGAGG TATGACTCCA CCAGACCGTC AAAGACGGCG TTGCGTCGTG CTCAGCATAGAAGCCGCGCG CCTGCTCAAC GGTCAGGTGC AGCATTATTA GTGCCCAACA ATTTTGAACCCTGCAGCTTC AAACGCGCGA AAGATCGTCC AATACGTTCT CCGACC 63 425 GATCTTTAGCCGGGCAGACC TCTACGCATA AATTACAGCC AGTACAGTCT TCCGGCGCGA CCTGCAGCACATATTTCTGG CCGCGCATAT CGCGGACTTC ACGTCCAGCG AATGCAGACT GGCTGGCGCGTTCTCCATCG CCTGCGGGGA AACGACTTTC GCACGAATTG CCGAGTGAGG GCAGGCAGCGACGCAGTGAT TACATTGTGT ACACAGTTCC TCTTTCCAGA CAGGAATCTC TTCGGCGATATTGCGTTTTT CCCAGCGGTG GTGCCCATTG GCCATGTTCC GTCGGCGGCA GGGCGGAAACAGGCAGTGCG TGCCGAGGCC CGCCAACATG GGCCGTAACG TTTCAGAAAT CGCAGTGAGACGGCGGCATC CCATAGGATT ACGCTGAGAT CCAGATCTCC AACATCTCAT CTAAA 64 333GATCTACCGG GTGAGCGTAT AACCNATCTT AATCCCTCCC GGTTAGGTTG ACATTAGGATCCTGTTCCTT TCGGGTTATA CTGCGCTGAA CGCGGGTCCA GTCCAACGTG AATACGGCAGATAAACCAGA CCAGCCAGTA ACACAAAAAT AAAAATTCGC AGCTTCCACA AAGCCAACCCAGCCGCTTTC GCGATAGAAG TCGACCATGC GAACAGATAC AGCGCTTCAA CGTCGAAGATAACGAAGAAC ATGGCTACCA GGTAAAATTC GGAGACAGGC GTAAGGCGCG CCGGTGCGACCATTCATCTC CATCCTTTGA ATTACGGACA GCA 65 374 TTATCAATAC CCGCATTTTTACTGAAACCG GGCGTGATGT TTTTGGCTTT GACATTGCGA ATGACGAAAT GTTTGCCATTTTCTACGTGC ACAAGCTGTC GGCAATCAGA TCCGGTAATA TTGGCCACCA CAAAGTTTTTTACTGCCTGG TCTTCAGGAT AACTGTTGTC ATAGGTGCTA CCCGCCAGCC CGATCCCCCAGTTGATTTTG CCATTGGTAC AATTAATGCG TTCGATGACA TGATCGGAAA TCAGGATGTCGCGGTCGTGA TCGCGACATT CCACTCATGG CGTCCCCTGT AATCGCTAAG CGCTATCGTAATCGCGCGCA TCCATTGTTA TGAATCCTGC GAGATGGCGA GTGCGTGGTA CGGA 66 296GATCCTGAAA TGCCCATCCA CGCCAGCTTG GGTATAGAGC AATCTGGCAG TATAAGATTTGGGATGTATT TTGGCCGCAG CCGCAAAAAA CGCGTCTGGG CGATTCGGAC AACCAGAAAGAGGCGCTCTG TAATGCGGTC TGGGCTATGG GACGAATTTC CAGATAATAG TAAACGATTAACCCTACACG AAAGCGTAAC AGAAGCGCAT AACGCCTTTA AAAACCACAG TAACACGCCTGCATTATAGT TTTTCTTACT CAACATCTAT CGTTCGCATA CCGGATGTAA TAGGCT 67 178GATCGGCAAA GGTACCGGTG GTGCCGTCGT AGTTTTCTCC GCGCCGGGCG TTAACGTTCTGGCCCAGCAG GTTGACCTCA CGCGCGCCCT 
GGCCGCTAAC TGGGCGATTT CGAACCGGATCATCGTCTCA GGGCCGGCTG ACTTCTTCGC CGCGGGTATA CGGCGCACAC GTAAGTAC 68 327GATCAAAAGT TTTCTGCGCC GCCTCGTTCA TCAGTTTATA AGGATTGCTC TGATCCGCTGCCGTTGCTGC GCTTAATGGC GCAATGACCA GCAGGGCCAC CATCATCAGT CGTTTAAACATGCCTCAATT CTCCTGAGAT TATTTCGTTT CGCCCGCGGG CTTGTGGCTT CAGTATGACCTTCCGTTGCG GGCTGGCGCA TCGCAGAATT CTTATTGTCG TCGCCTTCGT GTTATAAGGAACTGCCAATC ATATCTCCAG CACATGCAGA CGGTCTGATC GTACTGCACG CTAGATAGACGTCAGACTCA ACACAACGAG CTAGCGA 69 375 GATCCAGCAG GTTGATTTTT GTTTCTTTGTTAGGAACTAC CGGGGTACTG CTTTCAGGTG TGACAATTTG TTCAGACATA TGCTATTCCGGCCACGTTAT TACACGTTAT GGCCCCTGGA GGTTGAAAAA AGAAACGCCC CGGTAAGCTTACTGCTCGTC CGGGGGCGCT GCATTGTACA AATTCTGGCG TAAGGAGTCC ACGTCTGCACGCGCATTAGC AAAAATAATA TTTGAACCGA TAATTTATCG CCAACGCATT TACAGCGTGAAAGACGAAGG AGATTAACGG GTGGGGGCCA CTCGCTTCAC GAGAAAAGCG ATTCGGCTGGCGATTCAGCG AATCGACGTG TGCGTTCAGT ACTATCACGT AGTCG 70 298 GATCGGACGGCGCCTTATCT TCTTCAATAT CGCGCGTACC GTAGAAACCT TCAGGCAAGG TCGCTCAGCGACAGCCTGCT GGCTGAGTCC GAGTTGTTCA CGGGCATTGC GCAGACGAAC GCCGGTGGTTTGTGCTTCAT TTTGGTCGTG CGTTGCTTCA GTATTCATTC GCTACAGCTA ACGGTACGTGTAAATTAGGA TTCAGGCGCC GACGAGCGTA ATGCCGCCAC GCGCAAACAT CGTAGTACTTAGTCAGACAG TATACGTTAG CGCGCGATAC AGCTAGAACG CTAACTGT 71 234 GATCTCACCTTTTTTTAGCT GCGGCATCGC TTCCAGAGTG GCGACCGCCG GGTACGGGCA AGGTTCGCCAACCATATCCA GACGGTAATC AGGGACGATA TTTTTCATAC AGATTCCTTA GCAGGCGTCAGCCCGCACGG CGAAAAAACG TTTTTTTCCC AGCCGATGAT TAACATTCAG TGGTAAATAACAACAAAGTA GGTGACACGC AGACCGTAGG ACCAAGTATT CAGC 72 317 AGCTCTGATTTCGGTAGCGA TACGTCATCC ATCAGATTCG CCAGCGGATG GACAAACGGC AGGATGACCAGGCTGCCGAT CAATTTGAAC AATAGGCTGC CGAGCGCTAC CGGACGCGCG GCAGCATTGGCGGCGCTGTT ATTGAGCATC GCCAGCAGCC CCGATCCCCA GATTGGCGCC GATGACCAGGCACAACGCCA CCGGGAACGA TATAATCCCG CCGCCGTCAG GTCGCCGTCA GCAACACCGCCGCCACTGGG AATAACTGAT AATAGCGAAC ATCCGGCCAA TAGCGCATCA GCATATGTGCCTGAGAG 73 134 GATCGAGGGC ACAGGAGAAA CGGGCATTTT CGCCGCAATT AGTTGACCTGATCTCCCAAG ACCAAATTTT CCTCAGCCGG AATATACCAG AACTGGTCGC GATATCCGCAAGATCGCGCT TCACGGCGTC GCTT 74 387 GATCGTAATG TGCGGCCAGT 
TCAAAACCGAAGCGGCTATA TAACGCCGGA TCGCCCAGCG TCACGACCGC CGCGTAGCGA ACTCGTTGAGCGAATCCAGC CCTTCATACA CTAACTGGCG CGCCAGCCCT TGCCCGCGAT ACTTTTCATCGACCGCCAGC GCCATGCCGA CCCACTGTAA ATCTTCGCCT GCACATCAAC CGGGCTAAAGGCGACATAGC CACACTGACC TTCATCATCG TGCACAGTCG AGGTAGAAAA CATCTCACGAAATCGTGAAC AGCTTGCTTC GCATGTTTCG ATGACGGCGT ACACGCGATC AATACAGCGCATCATAGATT TATGATAGAT GTATAGAGTG TGTCTAGAGT TTATCGCTAC ATCGAGT 75 189GATCGTAAGG ATTGACGATT AACGCCGACG TCAGTTCATT CGCCGCTCCG CAAACTGTGACAGTACCAGT ACTCCAGGGT TAGCGGGGTC CTGCGCGGCG ACAAACTGTT TGTGGACCAGGTTCATCCCG TCACTCAACG GGTTACTAGC CCGACGTCTG AATAACGGAA TATACTTCATTAACAGTTT 76 217 GATCACGAAT ATTCATTATT CATCCTCCGT CGCCACGATA GTTCATGGCGATAGGTAGCA TAGCAATGAA CTGATTATCC CTATCAACCT TTCTGATTAA TAATACATCACAGAAGCGGA GCGGTTTCTC GTTTAACCCT TGAAGACACC GCCCGTTCAG AGGGTATCTCTCGAACCCGA AATACTAAGC CAACCGTGAC TTTGCGACTT GGTTTTT 77 275 GATCCCTTCTTTTGCTGATG CAGTAGCGGA CCAGGCTACC ACAAGGGGAA TGATGCAGAC TGCGAAAAAGTTTTTCATTT CAGAACCTGC CTTAATATTG GGCTAAAAGA CAAGTTTCAC GGTATAGGGTATGATATAAC GATTCAATAA ACGAAGCCCA AAAAACGGTC TATTGTAACG CTGGGTTTCTGTAAGCGGGT AAAATGAGAT GAGATTTAAT AACATCAGAT ATCTCGGATG AATCACTCTCGAATCCGCAG CGTCCATCTA CGTAT 78 101 GATCTTCATA CAGGCCCAGA TAGCCGTCATAAATGCCCAT GACTTCCAGC CCTTACGTCA ACGCTGCAAC ACAACACCGC GGATTTTTGATTCATTCTCT T 79 303 GATCCGCACG GATAAAAACT CGTTTCCCGG CCAGATCCAGATCGGTCATC TTAATTACAG ACATGGTGAA TCCTCTCAAT GATGCTTAAA GTTTTGTCGACGCTGACGCG TGAGCCTGAA ACCAACTGCG GCCATCGCTA ACGTGGTGTC GAGCATCCTGTTAGCAAAGC CCCATTCATT ATCGCACCAG ACCTAGCGTC TTGATCAGTG GGCGCACTGACCGGGTTGGG CATCACATGG CGTGGCTGGT AATTTGGACG GTGCATGTAC TCATGATGGCTTGGTTGGCC GGATTGCTTG CTT 80 257 GATCGTGACC CGGATAACGC TCATCATCTTTGGTCAGTTC CGGCGGCGTC ACGGCAAAAC CGCGGCGCCA CTGTTTAACC TGCTCGTCACCATATTTTTC TGCCGTTTGC GCTTTATTCA GCCCCTGCAA CGGCCATAGT GACGTTCATTGAGTTTCCAG GATTTTTTCA CCGGCAGCCA CGCTGATCCA GTTCATCCAG TACGTTCACAGGCTATGGAT AGCGCGTTTC AAGTACGGAA GGTAGGCAAA TCAAGCG 81 290 GATCGAGCAGGCATTGCAGC AGCAGACTTT TGCCCTCCCC GCTGCCGCCA ACCAATGCCA CCATTTCGCCGGGCGCGATA TCAAAAGAGA 
CATTCTGTAA TAACGGCGAC CAGCGTCTCG CGCCATACCAGCGATAACGG CGCTTTCCAG CGTAACCTGT TGTAAACTCA GATACGTCAC TCCTTAGCACAGCCGCTGAA TGGCGGAAAC TGTCGAAGAG CATCACAGCG TGAATAACAT TAGGCCGGGAATAGACAGCA CAGTTCATGG CTAATAACGT ACCGTCGAGA 82 233 TGCAGATCCA CCTGGAACGGCGGGATGTTG ATCACCTGGG AGGCCAGACC GCTATTACGG CGCATTAACG CGCCATTACCTCTTCGATGT GGAATGGCTT CGTCACGTAG TCATCGGCCC GGAGCTGAGA ACCTCGACTTTATCCTGCCA GCCTTCGCGC GCGTTAACAC CAGAACCGGC AGTGAAACAT CACTCGTGCGCCCACGGGTA TTAAGGAAAG GCCGTCTTCA TCC 83 284 GATCTCATCA AAACGGTTGAGTACCAGCGC CAGGGTCATA CCCGCCTGGT TCAACGCCGT CAGGTGCGCC AGTTGTTGACGGGCGGTCAC GTCAAGCCCG TCGAACGGTT CATCAAGGAT CAATAACTCT GGCTCAGACATCAGCACCTG ACACAGCAGC GCTTTTCGCG TCTCGCCGGT AGAAAGGTAT TTAAAACGCCTGTCGAGTAA AGCGGAAATC CGCGAACTGC TGCGCCAGTA TCGCACAGCG CAGGATGGTGACATATCCTG AATATTCGCG TAGT 84 367 GTTGCGATTA TCCCGCAGCG CCTGCTCGAACAATTGGATT TGCTCAGTGC TTTCATGCCA TAACCAGAAG GTACTGATTA ACTGGAACACCAGCAGAATA AGACCAATTG TCAGCATTAA ACGCTGGCGA AGGGTCACTG CTCTTCGCTGAAAACGCATC AGGCTCACTT AGCTTTCCTC AGTGGCAACC AGCATGTAGC CAAACCCGCGAACCGTGCGA ATGCGACTTG CCGACTTTGT CGCGCAAATT ATGTATAGCA CTTCCAGAGTGTTGGTCGAG GGTTCGTTAT CCCAGTTGTG ATATCGTTAT AAAGAATTTC CGGTGCACGACTGCCTGAGA CTAACCGTGA GAGCACGTAT CTAGCTC 85 320 GATCGTTGAT CGCCTGGATAACAACCTGCT GCTGCTCGTG ACCGAATACC ACCGCGCCCA GCATAGTGTC TTCGCTCAGCAGTTCAGCTT CGGATTCCAC CATCAGCACA GCCGCTTCGG TACCGGCAAC CACCAGGGTCCAGCTTGCTT CTTTCAGCTC GTCTGGGTCG GGTTCAGCAC GTACTGGTCA TTGATGTAACCTACGGCGCG CGATTGGGCC GTTGAACGGA ATGCGGACAG CGACAGCACG ATGCGATCATCGCACGATGA TCAGGTACTG CGTACGAACG ACGTCCGATA ACTCGATGTA CAGCTCGGAA 86 249GATCAATAAA TACTTTACGA ACTTCACTGG AGATTTCCCA TTTAGTGTCA TTTGGGCAGTTTATAAACAA ACGCGCGGTA GTATAAAGGC AAGCCAGACG CATTGATATA CCCGTTAACGCCGACGGGTG ATAAGGAGAT CGACCGTTAT GGCTTTTAAA CCTGGCAAAT AGGATTGCATTATTCCAGCC ATGAAGCGCT GGCCATCGCG TTATTCACGC GCATCGGCTG ACACGCACTGTGCACTGCG 87 275 GATCGCCTTT TGCTGCCAAC GCTGCGGGAG AAAGAGCAGA AAGAGCGAAAACAGCTGCGA CAGCCGCCAG AGTCGATTTG AGCATGAGAT TTCCTTAAAG AGAGCAGAAATAAAGCAAGT GGAATGATTT TAAAGAGCCT TCTGGGCCAG 
GCAGCCTTTA CTATTTACGTATATGAACAA TGTACGTTAC GACGACGCGT ATCTGCATAT GATGTGACAA CATAATAATAAATGCATGAC ATACTATACT ATATATTAGC TACAAGCTAT GCTCA 88 325 GATCGCCGCGAACCAGCAGA GCCACCAGCG GAGACTTGCT GTCTTTCACC GCTTTCACCA GCAGCGTTTTTACCGTTTTT TCAATTGGCA GGTTGAATTG TTCCACCAGC TCCGCGATGG TTTTGGCATTTGGCGTATCG ACCAGAGTCA TTTCCTGCGT CGCGCTGCGC GGCTTTGCGG GATAGCTTCTGCAGTTCAAT GTTAGCCGCG TAATCAGAAA CATCAGAGAA AACGATATCG TCTTGCGCTTTGGCAGCCTG GAATTCATGC TGGTTGGCGA TAGACGTATG CTGTACGGGA ATCAGCCATAGTGAGATACG CTATA 89 230 GATCGATACG ACGTTCAAAG GATTCAAACC GCGCCATGGCTTCATCCAGT TTGCCGCTGT CAAGCTGACG ACGGACATCG CGGGAAGAAC TCGCCGCCTGATGACGCAGC ATCAGCGCCT GCGGGCGAGC GCGCGTGTTT CGCTGAGTTT GTTTTCCAGCGTCGCCAATC TCTTTCTTCA TGCGCGCAGT GTCATCACAG CGTGACTTCT GTTCAGCTAGCATAATCGTC 90 146 GATCCCATCG CTTTTTCAGA TATCATGCAC TTTTTGCACT CAATCTGCGGCAAATCCGAC CACTTTTTGC TCAGCCAGAA TGCAGTATTT CCGTCATACA TCGATTAGCTACGACTCTAC GAACTACCTC GACCACAAGA TCACCG 91 184 GATCTTTGTT AATAACAGTGAGAGAACCGT ACGAATGTAG AAGAACTCCC GCCAGGCGGC AACATCTTTC ATAGTAGACCAAGCGTTAAC CCCTGCTGAT GTAAAAACGC TTCTATCTCT TGCGCACCAC GGAACGGAAGGTTGCGCGCC TTTAGCGCTT ACGGCAATAG CCGCGGCGGA TGGG 92 311 GATCAAACACATGAATACCG AGGCCTTTGA GTTTTTCAGT CGAGGCGTCC GAGCTGGAGA CCGCGCCTTCAATCTGGCCT TTCATTGTGC CCAGCGCATC AATAAAGTCT GCGGCCGTTG AGCCTGTACCAACGCCCACA ATGGTGCCGG GCTGTACTAT CTGAAGTGCC GCCCATCCTA CCGCTTTTTTCAGTTCATCT GCGTCATAGA TCGTTAGAAT GTGTGTGAAA TACGCCGCAT TATAGAACATGTCCGGGAAA ATCTCGGTCG TACACAGCTA CGATTCGATT GCGCGCAATT TTGAGGGAAA A 93448 GATCCTCGAT TAGGGGAGGC GCTAATTGAA TGTGGCGAGG TGTAAGAAAG CAGAAAAGCAAAGTGGGTTC TCGTTGCTCT GCATGTCGTC AAATTCAATT AAACGCATAA AAAAACCCCGCCGGGCGTTT TTCTTCAACT TCCAGGCGAT TACGGCGAAC GAAGTCGATG TGAGTCAGCTTCGGTTTGTA AGCGTGACCG TGTACAGCCT GAGCTTTAAC TTTTACTTCT TTACCGTCAACAACGAGGGT CAGAACTTCG TGTAGAATTC AGCTTTAGCT TGCATGTTCA TCACCTGGTCGTGGTCAGTT CGATAGCAAT CGGGCTTCAG AACCGCGTAG ATGATTGCCG GACTGTAGCGCGCAGGCGGC AGCTCCTACA TGCTCTTACG TACTCTGCGT GATAGTAACA TTAATCTCTTATATCTGCAG ACTGCACGAG ACTCGTCG 94 359 GATCATATCG ACGGTATCGG 
CGTAATTATTTTGCAGATGG CGTAACACAT CCAGATTATC TCCGGTCAGA AAAAGATTAT GGCTGTTTTTATTTTCTGCC AGAGTATTGT GTTCCACGTC AGGAACGATA ACGGTAACGG ATTTTTCACCCGCCTGTTTT TTTGCCGTAA TCTTTGCCAA TAAAATCAAT CTGATAACCG CTAGTCAGCTCAATATTACG CGCTTTCAGG CGCTCAAATC TGGCGAGATC AATCCGCCTT TCGCGATCAGTTCGCCCTCT CGTTATAGCG GATCGCGGTA AAAATTCCGC GGTAATCGCA GTTGTAACTCAGACAGAAGC GCGTATTCGG CGCAGACGC 95 298 GATCCAGTTT AACCTCTGGC TGCCAAATCTTTCTGGAAAA CATGCGGTGC GTTTGGCGCT TCGAAAGAAA CATCCTGGTA TAGATACGTTGGATCTGGAA AGCCATTTCA GTGTTATTTT TGTTCTGACA TGTGTAAAAC CCTTTAGTGTTGTTCCTTAA ATACTTGAGT AACGCCTTAA CGCAACAGCG GATCCAGTCC ACCACGCGCATCCAGCGATA CAAGTCGTCA CAAGCGCAAT GTGCTGTGCC TCAATCAAAT TTGCGACGTCGTCGCACTAC GTTGATATCT TTACGTCA 96 217 GATCGTAAGA GTCAGAAATA AGCAGGCGTAATGTTGTCAT AGTGGTTTTC CTTACCTTTA TTAAGCCGTC ATTTTACTCT TTTTCCTCACGCTCTTCCTC TTCCGGAACA GGCTTGCTGG CCGTTAGCAG GAAGGGCGAC TGCTGCCAGCGGGTGCGTTT ACCTTGTAGC AAGGTGNNNC AGACACCACG CCTATCGCAG CGAGAGTAGCAGCATCA 97 335 GATCGAACTC TTTAAGCAGC ATCTTGGTAT GGAAAATATT TTCCTGATACACGTTTACAT CCACCATGTC ATACAGCGAC TTCATATCTT CCGACATAAA ATTCTGAATAGAATTAATCT CATGATCGAT AAAGTGCTTC ATACCGTTGA CGTCGCGTGT AAAGCCGCGCACGCGTAATC GATGGTGACG ATATCGGACT CTAGCTGGTG GATCAGGTAA TTGAGCGCTTTTAGCGTGAA ATCACCCCGC AGGTTGACAC TTCGATCGTC GGCGGAAAGG TGCATAGCCCGCCTTCCGAT CGCTTCGATA GGTATCGACG CAGATATGCT CTATG 98 352 GATCGTCGTAGCTGCCGGCA TTGTGGTTGG GTAAATACTG GCGGCAAAAC GAGACTACGC CAGCGTCTATCTCTACCATG GTGATGGTTT CGACGTTTTT ATGCCGGGTA ACTTCACGTA GCATTGCGCCGTCGCGCCGC CGATAATCAG AACGCTGTTT CGCATGACCG TCCGCCACAG CGGGGACATGGGTCATCATT TCATGATAAA TAAACTCGAC GCGTTCGGTC GGTCTGTACC AGCCGTCCAGCGCCATCACG CGGCCAAAAG CGGCTTTTCA AAGATGATTA AATCCTGGTG ATCGTTTTCATGATACAGAA CTTGTCTACG GCAAGTCATG ACCAAACTGG TC 99 127 GATCTGTTTCGGGAAGTGAA CTTAAGGCCT CCGCAATATC ATTTATATAA ACTGACATGG CATTTTTAAACTGCTCAGTA CTGCGTTTAC ATTTGTGGAA GATAGTCTCT GAGAGCAGAG TTTCTTT 100 345GATCGGCAAC CTGCATTGCC AGTTCGCGGG TTGGCGTCAG GATCAGAATG CGCGGCGGCCCCGATTTTTT ACGCGGAAAG TCGAGCAGGT GCTGCAACGC CGGCAGCAGA TATGCCGCCGTTTTACCGGT 
GCCTGTCGGC GCAGAACCGA GTACATCACG GCCATCGAGC GCAGGCGTAATGGCGGCGCT GAATGGCGTC GGGCGAGTGA AACCTTTATC CTGGAGGGCA TCCAGACAGGCTTTCGTCAG ATTCAAGTTC GGAAAAAGTG TTACAGTCAT GTCTACCTCT GTGTGGGCGCTGATTATAGA CTTACGCGCA TCTCATCTGT GATGATATCT CTCAG 101 250 GATCCGGGACATTCACGTTG AGAATACGCC CGGTACGCAA CGGCTCCCGG CTTAACCCTC GCAAAAGCGCACAAGTCACG GCCGCAGCGA TACATAATGC TGATAGCCGT TAAGGGAGAC CGCTAATGCCGGAAAGCCGA GATGACGACC TTCATCGCGC GCACAGTACC GGAATAGATC AACATCATCGCCAGATTCGG ACCGCGTTAT ACCGGAAACG ACATATCGGT GACGATTAGC TTACGCAGAT 102333 GATCCCGGCT TACGACGGTT GGCTGGATGA CGGTAAATAC TCATGGACTA AGCTGCCGACATTCTACGGC AAAACCGTCG AAGTCGGGCC GCTGGCGAAC ATGCTGTGTA AACTGGCTGCAGGTCGTGAA TCCACGCAGA CCAAGCTCAA TGAAATCATT GCGCTTTATC AGAAGCTGACCGGCAAAACG TCTTGGAAAT TGGCGCAACT TCACTCTACG TGGGTCGATA CATCGGGCGTACCGTTCACT GTTGTGAACT GCAAAACATA TTGCAGGATC ATACAGCTGA TTGTAATATCGGCAAGGATT ACACCAGTTT GAGACGGCAA TCG 103 284 GATCCAGCCA GACGGAACCCCACGGCGGCG GAGACGGCAG AGCGTAAGGG CCGATAAACA GACGCTGCCA GGCCTGTGCAACGACTCTTC GCTGTGGGTC TTAAACATAG CCGCCACAGG GCAAGGCTCG GCATCAAGCGGCCACTGCGC CTGCAGTCGT CGTTTAATAG TCGTCCTGGA CCAGAGGAGC GGTTTCGTGGCTTTCCGCGA ATAATAAAAC AAGTGCCAAG AACAGTGTTA CTGCAAATCA TCTCGTTGTAAAAAGTGTAT TAAACATCCG TAAA 104 249 GATCAACGCA AACAATCAGA ACCTCTGCTTCATTTAGCAG CGTGTTCTCT GCGTTGACAA TGCGTTGCGT GAAAACCAAA GCGGTGCCACGCATTGACGT AATTTCTGTT TGAGCTTCAA GCATATCGTC GAGCCGCGCA GGCCATAGTATTCCAGCTTC ATCTTGCGCA CCACAAAGGC TACCCGCTCC GCAGCAGCAC CTGTTGCTGAAGTGATGGTG GACGTCAGCA TCTCGNNNTC TTCATAAAA 105 248 GATCCCTTTA CGACCAGGCGTCCCGGCGCC GTTATAGTGC CAGCCAAAAC CAAAGCCGCC GCCCGGTAAA CCAATCTGTTCCAGCATTGC GGCCAGCACG ACGACCATCC ATGACCACTG TTCGCATGCT GCATACGTTGTACGACCAGC CAGCGATGAT TTCGGTTCTG TCGTCGCATC TGTGGCAACG CGACTGGGTGGTGTAATCAA GATCATTTCG CAGGACTTGG TGCATTGTAG AATCGAGA 106 175 GGCGGAGGATTGCCACGTNG CAGCCTGCTA CGCCCGTCAG TTCTTTACGC AGGTTAGCCA CCAGTTCGTTTACCATGTGG CGGCTCCNTG TCAGTTTCCA GTTACCCATC ACTAAAGGAT GTGATTTATTTNTCCACGTT AGTAGCGAAT TAAGGAAGAT GGCCGCTCGT AGAGA 107 307 GATCATTATCTTAACCTAAA ACCGCTATAT 
TTATAAGTAT TATTACGAAT AATCTTAACC TGGGATATGTTATACTAATC GGACCAGAAA GATATTATTA CGACTTTAGT AAATGCTTTT TAAATATTAAATAATAATTA ATTAAGATTT CTACCATTCA TTAATTATAC TTAACAATAG TTTCACACCCCGCGCCGGAA AGGTCTAACC TTCTCATTTA CCTTTAATAC TCAGTATTCC CGAATAGCCGACCGACACTA ATGATGAATG CTTATCTCTC ATAAACCAGA TATTATGACA CATAACC 108 234GATCAGGATA TGCCGCCGCC AGTAGCGATA GGGCGTCAAC CTCGTGCTTA TCGGTGATGAGCGGCGCGTT GGCCGGGGCT TTTAAAAACG AAAGCATTAT CCTTCCTTAA ACGTAACGCTGGGGCAACGA GACGCTCACC CGCGTACCGT GGGTACAAGA GATGGTTAGC GTCCGCCGAGCGACGACACG CGCTTCGCAT TCGGTCAGGC CGAAGCCTCT TGGTGAGACC GCCG 109 352GATCGAGCGC GGAGAACGGT TCATCCAGCA GCAGTACCGG CTGTTCGCGT ACCAGGCAGCGCGCCAGCTA CCCGCTGACG CTGGCCGCCG GACAGTTCGC CCGGTAAACG CGTCATCAGACTCTCAATGC CCATCTGATG TGCGATAGCT CCCGTTTTTC CCGCTGGCTG GCGTTGAGCGTTAACCCAGG GTTTAGCCCC AGACCGATAT TTTGCCTGCA CATTCAGGTG GCTGAATAAATTATTCTCCT GAAACAGCAT TGAGACCGGA CGGCGTGAGG GCGGCGTAAG CTATGATCGTCGGCAATAGT AGCGTACGCT GGCCAGGCGC AAGAAACCGC ATAATCTCTC TT 110 168GATCAGGGTC AGACGCTTGT GCGCCCATAC AACGTTTTGT TCCAGTTGGC CTTTCTCGTTAACGTTTTGG GAGCGCCAGA GCTGTTTAAC GCTCATGGGG CATTCCAGAA CGGGCAGTATCTCTTCAAAG GACGTTATCG TTTGTCAACG GCGGACAGCA TTTTCAAA 111 211 GATCTTCGGGGCGCACCCAC GGGGTTTTTG CGCGGGGGAC GCCTGTGTTA TCAGCATTGT AGAAACTGCGATAGATATTT CCGGTGAGGC AATTTTCGCT CGGCACGATG TGTCGCTTAT CCGGTATGTGGTGAGCAGTG TGCGCCGGGG CGTGTGATAG AGCCATTGCG CGATGGATCG TCTAGTGAGTTTCTCAGATA GGGGGTGACG A 112 257 GATCCGCAGA TCCATCTAAT CGGATTAGGCGCATACTGGT AAAGATTCAG CCCCCCCGCC AGCCCAATCG GATCCTGACT GACGAACCGTCCACACTCCG GTGCATAATA TCTGAACAGA TTGTAATGCA GCCTGTCTCG TCGTCAAAATACTGCCCCGG CAGCCGCAGA CCGGCTGGTG AAGTACGCCC GCTGTTGCTG ATGTCCGCCGCATTTCTCCA ACCCTGATAT ACCGCCACAC AGCGTCGTCG CGCGTAC 113 359 GATCCTGACTGGTACGACTT AACGTTTTAG GCTCGCCAAA ACTCAGCCCC GCCGCTTTCA TCGCTTCCGCGCCTTTGCCC GCTTTCAGCT CGACCAGCAG TTTTTCCGCA TCCAGCTTCG CCTGTTGTTCCGCTTTATTA TGCTTCACCA GGGCAGTGAC CTGTTCTTTC ACTTCTGCCA ACGGCTTCACGGCTTCAGGT TTATGTTCGC TCACGCGTAC GACAAAAGCC CGGTCAACCA TCCACGGTGATAATGTCTGA ATTCGGCCCG GCGTACCGTT TGCACAGACG CATAAGATAG 
CATCGGCTAACGTTGAAGTC AGCCTTCGGT AAGGTGTACG GCTAACAGCG GTTACGCTT 114 427 GATCGCGTACCGCCAGTAAC GCCGCCGCTT TACCGTCAAT CGCCAGCAGG ACCGGAGTCG AGCCTTGCGAGGCCTGCGCG GTGATTTCCG CCGTCATGTC ATCCGTGGCG ACGTGCTGTT CGTTCAGCAACGCCTGGTTC CCCAGAAGCA GTTGATGACC TTCCGCTTCA CCGCTGACGC CCAGTCCGCGCAGCTTCTGA AACCGTTCAC CTGCGGCAGT TTATCATCGC CGGCTTTTTC CAGAGAATCGCATGGGCCAG CGGGTGGCTG GAGCTTGTTC GAGCGCGGCA GCCAGACGTA ATGCCTGAGCTTCTCAACGC GTTAAAGGTT TTATCGCACA CTTGCGGCTT GCTCGTCAGC GTCCGGTTTATCAAACTGAG GTATCAACGT ACTGGCGCGT GCAGGATGGC ATGTACAGAG CGATGAG 115 299GATCTGGAGG TAGAGGTTAT CGAGGCCAGC GGTAAAACCT CACGTTTCAC CGTGCCTTATTCTTCCGAGC CGGATTCGGT TCGCCCCGGT AACTGGCACT ATTCGCTGGC CTTCGGCAGGGTTCGTCAGT ACTACGATAT TGAAAATCGT TTCTTTGAGG GAACGTTCCA GCACGGCGTTAATAACACCA TTACCCTCAA CCTCGGTTCA CGAATTGCGC ACGGTTACCA GGCATGGCTGGCGGGCGGCG TCTGGGCCAC CGGTATGGGC GCGTTCGGCC TTAACGTCAC CTGGTCGAA 116 339GATCAGAGTA AAACCTGGCT GCTATGGTGC GAACGTGGCG TAATGAGTCG CCTGCAGGCCTCTATCTGCG CGACGAGGGG TTTGCCAATG TGAAGGTGTA TCGTCCGTAA TTCCTTTGCCGGGTGGCGGC TATGTCCTAC CCGGCCTATC GTTTTATTTC TGCCCCAACC GTTTTGCAATGCGCTCCAGC TTCATCATCA GCAGCAGCGT AATGGCCACC AGCACAATGG TCAGCGCGGCGTCAGCATAT TTCACGTCGG TCAAGCTAAA GATAGCCACC GGCAGCGTCG TCAGCCGGCGATAATCATCA TCGTGGCCAA CTCCCATGAG AGCATAACT 117 378 GATCGATATC AGGGAGGAAGTGGTTGCCCG CCACCAGCGT ATCGGTACTG ATCGCCAGGG TCTGCTTTTC AGGAATATCAGGAGCGCGCA ATCGTCGCCA ATACCGGTTT CAACATCAAG ACGAGAGCTT CTTACACGGTCAAAATAACG GGCAATCAGG GAAAACTCGC CACATGCCAT ACGTTATGCC TCAGCAGAAAAAAAGAAAAG GCCGGAGACG CGGGTATCGA GCGCCCGCTA TCTTTCCGGC CTGTGAATCACTTTTTGTTG GGACGAATCA CCGGAGCTGC TTTATCAGTA CGCGTTGACG ATTTGTGGCTGTCTTCACGC GCCAAAGTTT GAGTTCATCG CTTCGTTGAT GGCCATTATA AGCCAATC 118 266GATCTCTTAC GATAAAGAGC ACATTATCAA CCTTGGCGCG CCAGATTGGT ACGGAAGATTTTGCCCGTGC GATGCCTGAA TACTGTGGCG TGATTTCAAA AAGTCCGACG GTGAAAGCCATTAAAGCGAA AATTGAAGCC GAAGAAGAAA ACTTCGACTT CAGTATTCTC GATAAGGTGGTAGAAGAGGC GAACAACGTC GATATTCGTG AAATCGCCAG CAGACCCAGC AGGAGGTGGTGGAGTAGAAC GTGATGATCG GTTTCT 119 345 GATCATCTTC CACTTCCAGA 
TGCACCGTCACATCCGGGTT AGTGAGCTTC ACGCGCGCCG ATTCAATATG CTGATTTAAT CCGCCGCCAACATAGCGCTC CACTTCAATG GAGCTAAACT CATGCTTACC GCGACGTTTT ACCCGCACGCAGAAGGTTTT GCCTTCAAGC TGTTCGCGAT ACTGCGCCAA ACGCTTTCTC GAAAATGTCGTGCATATCGG TGAACGGCAC ATCTCGACTT CAAGAATATG TGAATCCCGG GATCGTGGTCAGCGCTCGGA ATCACAGACG CTGGTTTCAC TTGCGCGACT CATTTACAGT CAGACACGTGTAGTGCTTAA CTCAG 120 321 GATCATCCTG GAGGTCTTTA TGGCTGATTT CACTCTCTCAAAATCGCTGT TCAGCGGGAA GCATCGAGAA ACCTCCTCTA CGCCCGGAAA TATTGCTTACGCCATATTTG TACTGTTTTG CTTCTGGGCC GGAGCGCAAC TCTTAAACCT GCTGGTTCATGCGCCGGGCA TCTATGAGCA TCTGATGCAG GTACAGGATA CAGGTCGACC GCGGGTAGAGATTGGGCTGG GCGACGGACG ATTTTGGCTG GTCCTTCTCA GGCGCTATTA GTACGCGGTTCATGCAGTAC ATACTACCTG AAGTCACGAT GCACCGAATA G 121 216 GATCGGCGCGCGTATCTCAG GCATGTGCGC CGCCAGTTGG GAAACGCGCC CGCCGGGGCC CTCAATTTCATACGCAGAAT ATCCGCGCGC GCCGACCGCG CCGGCAACGG CGCGGCAGAC ATTGACGCCGGCGGGCAGCT CGCGGGCTGT GGCAGAAGGG CGTCACGCTG CCAGGCCTCG TCTGGATAGATTGATATTCT CGACCACATC CCGAAA 122 292 GATCGGCAAA CAGATAGTCC TGCGACGCATTAAATCCAGG CATTGCCGAG GAGCACGCCG AAGCGGATAC GCCAGGCGGG CAGGCCATACCTACGGTATT TGTCAGACCA AACGCCTGCG GGTTGGCAAG AATTTCCTTA AAGAGGCCGTTGATATCGGC ACGGGCTATA TTGCCGCCGT GTTGCTCCAG CCCCTTCTCT TCCATCTGATTATAATAATC GGTCAGAGCT GACGCTGCCC TGCCGCCGTT CATAGTTGCA GAGTGTCACGAGCAGTGTGA TAATGATGGG TT 123 109 GATCAGCGCC GCGCTACGTT AATAGCCGGTTGCGACGACC GTGGACGCTA GCAGAGTCGC GGATGACTTC CGTATCGGTT GGTCCACGCGTGAAATTAGT TGCGCGACA 124 258 GATCGGTCGC ACGCCGGAAT ATCTGGGGAA AAAAATCGGCGTGCGTGAAA TGAAAATGAC CGCGCTGGCG ATTCTGGTCA CGCCGATGCT GGTCTTGTTGGGTTCGGCCT GGCGATGATG AACGGATGCC GGACGCAGCG CAATGCTGAA CCCTGGCCGCACGGTTTTAG CGAAGTGCTA TATGCCGTCT TCCTCTGCCG CCAACAACAA CGTAGATTTTTAGTCTACCT AACTACTTCT GAACTACGGC ATCTCGAC 125 384 GATCGTTGGT CTTTAAGGCCGCCGCCAAAT CGCTGTCGAC CTGCTTGTTG CTGTAAAAAG CGGTATTAAA CTGCGTCGGCGGCCAGTTTT GTGATGCGAA GAGCGGCGAT AACGCCCAGT CAGCTTCGCC CGTCAGACGCCGACCAGCCT GTATAGAACA TTCGCACGCG CTCTCTTTTT GCCCTTTGCC CTCGACTTCCGCGGCGGCTG GCCGGCGTAC ATCGCGGTTA TCCGGGCTTT AACGACCAAT CTGCGCCAGTTGCTGTTGGG 
TAAACTGCAA GAGTTTTTGG GTGCTATGGT TGTGCATGAC ACAGCGTGTACTGAACGTCT GATACCGCTT TCACGTCCCC TAGCGATCAT GGCCAGTGAA GTTGCATAGC TAGA126 448 GATCATACCT TGCTTGATGA CTGCGCCACT AAAAACCTGA CGCCGGCGAAAACCCACTGG GCGCGCCCGC TTGATGCGCC GCCCTACTAC GGTTATGCGC TGCGACCCGGCATCACGTTT ACCTACCTGG GTCTGAAAGT CAATGAACGT GCCGCGGTGC ATTTGCCGGTCATCAAGCCG CAACCTGTTT GTTGCCGGCG AGATGATGGC AGGAAATGTT CTGGGCAAGGGGTATACCGC AGCGTAGGCA TGTCTATCGG CACAACCTTT GGCCGCATTG CAATAGAAGCCGCCCGCGCA CAAGGAGGCG CACGATGAAA CAGCTTGAAA ATTATCATTG AGGCACGTGCTTACGAACGA AGCGAGGTGA ACTGTCATGC AGTGTGTACG TGTGTGCTAC TCGAAGGTTTGCGGATTCGC ATGACAGGTG ATGTAGCGAT ATATCGAT 127 392 GATCCCCAGG AGGTCTGGTTTGTCAAATCG CCGAAATCCT TTTTAGGCGC CACGGGCCTG AAACCGCAGC AGGTCGCGCTGTTTGAAGAT TTAGTCTGCG CCATGATGGT ACATATTCGT CATACGGCGC ACAGCCAATTGCCGGACCGA TTACCCAGGC AGTGATCTGC AGGTGGCACT TTTCGGGGAA ATGTGCGCGAACCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA TCGCTCATGA GACAATAACCTGACAAATGC TTCAATAATA TTGAAAAGGA AGAGTATGAG TATTCAACAT TTCGTGTCGCTTATCCTTTT TCGCATTTGC TTCCTGTTTG CTCACCAGAA CGCTGGTGAA GTAAAGATGCCTGAAGATCA GT 128 327 GATCTTGTCA AGCTGGTCAG CATATCCCGG ATATCCTCCGCCTCCCCCCC CGCCACTCCG CGCGGCTTAT GAATCATCAT CATGGCGTTT TCCGGCATAATGACGGGATT ACCTACCATC GCAATAGCGG ATGCCATTGA GCAGGCCATT CCATCGATATACACCGTTTT TTTCGCCGGA TGATTTTTCA GGAGGTTATA AATGGCTATT CCGTCCAGTACTGCTCCGCC AGTGAATGAA TATGCAGATT TATACGGTTA ATCTGTCCAG TGCAGCCAGTTCTCTGCAAA CCAGCGAGCC GAAATTCCCA TCTCAATCTG TCATAAT 129 306 GATCCGCAGGAGAAAACACG ATTGTACAAA GAGGCGCAGG ATATTATCTG GAAAGAGTCG CCCTGGATACCGTTGGTGGT GGAGAAATTG GTTTCTGCTC ACAGTAAAAA TTTGACCGGT TTCTGGATTATGCCGGATAC CGGTTTCAGC TTTGACGATG CGGATTTAAG TAAGTAATGC GATGGGGCTGGATGGCGCGC GGTTGTCGCC ATCCGTAAAA GGTTCGTGTA TGCTAACTAT GTTCTCAGCGCTGCTGGATT ATTCTACGTG TTGATTGTGC AGTGCTGGTG TTTATTGTCA TTGTCC 130 301GATCTCAGCG ATGTTCAGTT AAACGCTGTG CCGGATGCGG CGTAAACGTC TTACCCTGCCAACGGGTTGG GTAAGCCGAA TAAGCGCCGC TCCATCCGGC AGCATTCACA TAAAGTCCGGCACCAGACGC TGTAACGCGC CTTGCGCAGC AGCGCCGTCG CACACTCAAT ATCGGGCGCGAAAAAACGAT CCTGCGTATA GTGCGCCTCC TGCTCGCGCA 
GTGTCTGCCG CGCCTGTTCCAGTAACGGGC TGGAGGTTAA CCTTCCGTAA TTATCCTGAC AGCAGCAGCA TCACGCATAT G 131329 GATCGCCGGT CAGTTCCTCC ATTAAGAGCG GCGCGCGCGC CAGCATCTCC ATGCAGAAGAGCCGCGACGC CTGCGGATAA TCACGCGAAA CTTCCAGCTT GAGACGGATA TACTCTTTGATGGCCTCCAT AGGGGAAAAT TCTGCGCGAA ACGCTTGAGC GGCGCACGAG ACATCCAGAATCTCGTCGCA TTACCGCGAC ATACAGCGCC TCTTTCGAGG GATAATAATA AAGCAGATTGGTTTGGAGAC GCTGCCGTAG CGGCGACTGC TCAAGACGCG CGATGATGCA TACTGGAAACACGAGCGCGT AGATAGCTGC GTTGCACGG 132 266 GATCCGCCCA CGCGTTAAGG GCCGTAAACAGAGCGTCATT CATCATTACC GCTGGATTCA CCGCCCTTCG TTCTTCTTCT GTTAACACCACGCGTAATCG CAGACAGGCC GGGCCGCCGC CGTTGGCCAT ACTTTCTCGC AAATCAAACACCTGCATCGC GCTGATGGGG TTATCCTCCG CCACCAGCTT ATTCAGATAG CGTCCAGACGCGACATGGTC TGACTTCCGC GCACCTACGC TTGAGCCGTG TTCGCTTGCA CTGCTT 133 319GATCAAATGC AGGCAGTAAA AGGGCGTCAT CAAGATTATC GGTACACTGT GTAGCGGCGGTTTGCAGAGT ACCATGTAGC GCCGGATAAT TATGCCGGGT CAGGTTGACA CCGTGCGTACCGTTAATAGC TTCAAAGGCG TCGCAAAACG CGCGGTGTTT TTCTGCGGTG ACGGGGTCTCCCGGCGCTTC AAAAGTTCGC ATCAAATGCG GGCGATGCTC TGATTCTGGT ACTTATCGTACAAAACGACG ATCGCTCTCT CATGATATAC GCATATAGCA TCATGCCTGT CCGTGCATAGTCGTAACTAG AGACATCAC 134 438 GATCAACCTG AACTCAACGG ACCCTGTACC GTCTAAAACGCCCTTAGCGT GAGTGATGCG GATTCGTATA ACAAAAAAGG CACCGTCACC GTTTATGACAGCCAGGGTAA TGCCCATGAC ATGAACGTCT ATTTTGTGAA AACCAAAGAT AATGAATGGGCCGTGTACAC CCATGACAGC AGCGATCCTG CAGCCACTGC GCCAACAACG GCGTCCACTACGCTGAAATT CAATGAAAAC GGGATTCTGG AGTCTGGCGG TACGGTGAAC ATCACCACCGGTACGATTAA TGGCGGAGCC ACCTTCTCCT CAGCTTCTTA CTCATGCAGC AGACACGGGCTATACATGGA CATCAAACGG CTATAGGGGA CTGTGAGCTA CAGATTACAC TGATGGCACGTGTTGGCACT ACACGCGCGT TCGGCGATGT GTATGAAC 135 363 GATCTTATCC TTCCGCTACAAAATCAACTG CGCCATCTGA CGCATATTGT CGGCGTGGAT AAACTGGCGG CTGCCACCACAGCGCTTGCG TTAGTCAAAT CATCGACCGC AGCGAACCGT TGCAGTCAGA CATTAACATTCACGGTGATG AACTGGCGGC AGTGCTGTTT ACCTCCGGCA CAGAAGGAAT GCCGAAAGGGTGATGTTGAC CCACAATAAT ATTCTTGCCA GCGAACGGGC GTATTGGGGG TTGAATTTAACCTGGCAAGA TGTGTTCCTG ATGCTGGCGC ACTGGGAGAC CGGATTTTAA GGAGGCTTTTATGGGGTAGT ATTGCTGGAC ATCTTACCAG AGCTCTACTA TAG 136 347 
GATCGATTTTCCCCTCCATG TTTTCATAGG GGAACAGGTT CGGGTTAAAA ACCACCTGAC GGATATCGCACAAAAAGCCA ATCCGCTCCG CCCAGTAACC GCCCAGCCCC ACGCCACAGA TTAAAGGGCGCTCGTCCACA TTCAACTGCA ACATTTTGTC CACTTCTTTC AGCAGATGCT GCATATCGTGCTTAGGATGC CGCGTACTGT AGCTTACCAG CCGAACATCG GGTCGATAAA CTGGTAATTGCGAACACTTT TTCATGGTGC GCGGACTATA TGAGTCAAAA CGTGTGATAT ATATCATCTGGCACCTCACG AGACTGAGTG ATGCGTGCGT TTCTGCA 137 278 GATCCCAGAC AATACCGTTACTGTTATCCA ACGATACCCC TGCCAGTGAG GTACGCAGGA ATCCATATTG GGTGTGATGCGCGTAAGAAA CGCCCGCCAT CATAGTACTT TTACGCCTGT CCAGACGACG CAACTGATGGTCATCGCTGT CGCCCGGTTT GAAGTACATC GGGGACCAGT ATGCCATGAT TGACAACTTATCGGCATTGT CATTCACAAG TAGTACCGCG CCAGACACGA CAGAGTTNTT CATAGGCATGACGATCGATA ACAGCTAT 138 385 GATCGTTATG AATCGCTTGC GTGATTTCCA GCGTCACCGGGTCGAGACGA TAAACTACGC CGCCTTTATC CAGTTTACGG CTTTGCGATG TAGCCAGCCAGAGCGCGTTT TCTTGCTGAC TCCAGGCCAT CTCATAACGC CTTTGCCTAC CGCTTTACGCAGCATGTCTT CCGCGCCAGC GTGCTAAATG AGGATGCGAC GAGGAGCGAA CCTAACAATAAAGAACCACG CAGGCTGGCG AAAAAAGATG ACGTAAGTGC ATGACGACTC CTTTGATAAAACGTGTATAG CTGCTTCACA CTACTTCGCT GCGTGGATCT GCAGGTGGCA CTTTTCGGGAAGTGCGCGAC CCTATTGTAT TTCTAATACT CAATATGATC GTTAT 139 282 GATCAGCGGCTATGGCGGTC CGGAAGGCGC GAAGATGGCA CGCCGGCGGG CACAGTTTGG TTTGCCTGGAATATTAACAA TACAACTTTT ACAAGCCGAC AACATTTCAA CGGAGATTGT CAGGAAGTATTGGAAAAATG CGTACGCTTC GCCCTCGCTG AATTGCTTTT CTGTTAACGA AGAAAGCATAACATAATTTC ACTGACGTCA GATACTCCGG CTAGATAAAT CGAGCTTACC GCGTGTTCGGAATTCGATGA TTCGGATATC GGTCGCCATC GT 140 179 GATCGGCGAC TACAAAACCAATCACCGCGG CTTTACCATC GAGTTCCATA TGCGTACGTT TTATCGCTGG GAGTATGGCGAGAATATGTC CCCGGCCGGA TAGAACCGGT TAAAGAGACC ATGCGTTACT TTTTCATGGCGGTATACATG CACAGTTGCT TGGTGGCATG ACATTGGAA 141 261 GATCAGTAAC AGGACGGTAGCAAAATTCGC ACTGAGCCCG GCGACATTCT GAACGAACGG TTCAATATAG CTATAACTGTGTAATGCGCA GTCACCACAA CGACGGTCAG TACATAGAGG CTCATCAGCG CCGGGCGTCTGAATAGCAAA AGGTAAACTT TTTAGTGAGC CGGAATGCTC GTCTGGCAAT TTCGGTAGAGCTTATCAGAA TAGCAGCGTA TATCTCCATG CGATGCAAAG TGGCCCAGCA AATCTGACAC T 142225 GATCATTTTG GTGCCGGTGT CAGCCTGCTG ATGTCCACTG GTCAGCGCAA 
CGGAATAGAACTCGCCGATA TAATTATCAC CGCGCAGAAT GCAGCTCGGG TATTTCCAGG TAATCGCCGAACCGGTTTCC GACTGGGTCA ACGACATCTT GCTGTTTTCC CTTCGCACAA GCCCGCTTGGTCACAAAGTT CAGATCGCCG TGTGTGTGCC GGACAGTTGA CGTGA 143 301 GATCATCCTCGGCGCGGGAG TGAATCACTG GTATCACATG GATATGAATT ACCGTGGGAT GATTAACATGCTGGTGTTCT GCGGCTGTGT TGGACAAACC GGCGGCGGCT GGCCGCACTA TGTCGGCCAGGAGAAGCTGC GGCCGCAAAC CGGCTGGCTG CCGCTGGCTT CGCGCTGGAC TGGAATCGCCGCCGCTCAGA TGAACAGTAC TCGTTTTCTA CACCATGCCA GCCAGTGGCC TATGAAACTGACTGCGCAAG AGTTGCTGTG CGCTGCGATC GCTAATTCGA CTATCGATTA C 144 272GATCATGTGG GTTTAACCCG TTGATTAAAC ATTGGATTAC GGAATAGCAA TTGCTTATTTTATTTGTCAT ACAAATAAGT ATAATACCCG CTTCCGATGT AGACCCGTCC TCCTTCGCCTGCGTCACGGG TCCTGGTTAT ACGCAGGCGT TTCTGTATGG AATACGCCAT CCCCTCTGATAGATGCCTTG TTGCCTTAAG CAGTTAACCC GCCTGAAGCA AACGACAAGA CGGCAGACGCTTACCGGCAT ACGACACGGA TGCTTCAGAA GA 145 358 GATCTGCGCA CATCATTCGGGTCATCGCTA AATTTTTCAC TTTTAATTCG CCGTCCGACA GTTTTCCTTC GCCGGTGAATTGATTGCACA TTTTGCCGGA TACCGTCATG TCCTCGCCAA GGCTAGAGCT CCGGGCCGGTGACCGTTTTA CCGTTTACGC TTTCCAGAAC AAAGCGGTGG TGCTCCAGTT CGTCGCGTTTGACGGACACT TTTCACTGCT CACACACCTG TCATTATGAT GCTCAGGGCG ACCAGCGTGATTTCTTCATT GATATTCTCT GTAATCTGAT AGGTTAACAC TGACTATAGT AATGATATGACCGGATAGAT CTTCAGGGTA TCCGAAAATC GTCCCTGA 146 224 GATCTGTTGT TACAGCATGGAATGCGCCGT CCTCCTCACC GGCCAGGCAA ACGGCGCGAT CGTATCGAAC TGTGCGCCGCGCCGAAAGAA GGGGGGCTTA GCCCTTCTTT CGGCGTCTTA CGCAGCGTAG CCAGCATATTAGCATTGCCT AACTGCATTA TTGTCTGCGG CGGGGATTTT ACTACGTAGC GCAATTTGGCACGTCTAGAA ATTCGTAAAG GTTC 147 268 GATCCTGAAT CGCCACGACA CGGGCGCCAGGCCTGCAAAC AGACGCGCGG CTTCGCTGCC GACGTTACCA AAACCCTGAA CCGCAACGCGAGCGCCTTCA ACAGCAATAT TCGCCCGACG TGCGGCTTCC AGCCCGCTGA CGAAAACGCCGCGCCCCGTC GCTTTTTCAC GGCCCAGCGA ACCGCCAAGA TGGATAGGCT TACCGGTGACGTAAGATAGT GACCGTGTGC ATGATTCATG GAATACGTAT CATATCATCA ATATTACT 148 314GATCCTGAAA AATACCAATT TTCAGCGGGC GAGCTTCGCC TTCCGCACTA AAACAGTGAGGAAAACGCTC GGCCAGAAAC GCGATAACTT CTTTACTGCT ATTCAACTTA GGTTGATTTTCCATGAAATT TCCTGATTAC AACGGACGTA GCCAACAAGC AGCAGGCATG AACAGGCGTCATTATAATGA CGCCATCAGT 
AATTGCTACG TTATCCGTTG ATTATCCTGC GACGTCGCAAAGATTTTTTG TATCCGTCGT GCAGCACGTT CAGCTGTCAC CAGCGTACCA GGCGTGTCATCTCTCGTAAC GCAA 149 379 GATCCAGAAT ATATAAAACC CCATTAACNC CAGCGCGCTTAATAACCATG TGGTCATCTG CGCTCCGTGG CTGGTTACGT TGTTATAAAT AAGGATGGCGACCAGCCCAA CGAAGATAAC GCTGTCTACG CGACCGCGGC GGAGAGGGCT ATAGAAAGCAGAGTGGGGCC ATTGCGACGG GGCATGATGA ACTGATCGTA GAGAGCGTAA GCCAATAATTCGGCAATAAA GAGAATCAGC ACCAGGTCCG TGATAGTCAT TTATCTCAGA GAAATAAAAAACGGGCGTTT GCGTAGTGTA CAACAGCCTT ACTGGCCAGC AGTCTACGAG TAGCCGGCGATACCAATGAC GAGAGCCACG ATATCACAGC GTACTTCTA 150 355 GATCCAACAA GCGGCTGGCGCCATAGCCGC CGCGAACCGG CATGACGATT GTATCCGGCG ACGTTAGCGA GGCCAGCGAATTAACATCGG CCAGCCGTTC CGCGTCCGTA CCGGCAAAAC GCTGAAAGGG CGACGAATCACCTCGTCATT CTCCACCTGA TGACCCGCGT CAGTCAGGCG CTGAACGCCG CGTAACGGCTGTTGGTTAAT ACAGTAGCCC GACTGGGCGA TTAATGAAAC AGAGACATGG TAATTCCTTGCTGACAATAG AATCGAATGT ATATCATGCG CATATATAGG CGATGTCTCG TGTCGCAGTTCTGATCGGAC AGGAGGCACT AGCTCGGGGT ACTTT 151 278 GATCCTTATT CCCGATGTGTTCACCTTTAA TATTCTCCAC TCGCGCGTGG AGGAGATGAG CGGCGTTCCG GTCGTTCCGCTATATGACAC GCCGCTATCA GGGATTAACC GTCTGCTTAA ACGGGCAGAA GATATCGTGCTGGCGTCGCT GATTCTGCTG CTCATCTCAC CGGTACTGTG CTGCATTGCG CTGGCGGTCAATTGAGCTCG CCGGGCCGTG ATTTGCCGCA GACGCTACGG ATGGCAGGCA AGCGATCAAGCTGAAGTCGT CATAGGAG 152 394 GATCAAAATA AAACTTTAAT CCCACTGGGG CAAGAGAGTGATGTGGTGAC GCTCAGTCCG GGTCAGGCGT CGGCGCATCT GCAATTTTAC GCGCGTTATCTTGCCGATGG CGGCGCGGTA ACGCCGGGGA CGCCAATGCC TCCGCAACCT TCATTCTTGCCTATGAATAA GTTCTTTTTA CGCTGCGCGC ATATATTGGT GCTTGCTTCC CATATCATGGGCGCAGGCTG GCGTGGTAAT TGGCGGTACT CGCTTTATCT ATCATGCGGG CGCCCGGCATTAAGCGTACC GGTAAGTAAC CGTTCAGAAG TCGTTCTGTT AATTGATACG CATATTTACTGGTGGGTCGG TTACGGAACA AAACGATGGA TATAGTCCTG TGTAGTGATA TGCT 153 324GATCGTTAGC AAGGTTTGCT GCGTCATCTG CTGGGTTTCA CGCAATGTGT GCGCGTTAAGCATCACAAAA TGGCTGGCGC GCGTCGCCCA GTGGGCATTG ATTTGTAATT CAAGCATACAAACCAGGTTG CGGTTGATGG TCTGAATGGC CTCGAAAATA GATTTTTGTA TCCGGGTTTCTTTACTGGCA GGCGTTATCA GCCCGCGCAT TTTGACGACA TCGTTCAGCA ACCGTTGCAAATGTTATCCA ACCGGGGAGT CAGCAATCGC GACAGCTGCC 
TTGATACCCA GTTACCTGACCGATCCGGAT GATCCGATCG GAAA 154 308 GATGGCTGGG AAGACGGGTG CCGTTCTGGTTAAGCGTATT CAGCTCTTCG CGCGGGAAAT AGCCTTTAAT CGCCAGGGTA CTGTACAACGCGGGGCCCGC ATGGCCTTTC GACAGTACGA AGTAATCGCG TTCCGGCCAG TCCGGGTCGGAGGGTCGATT TTCATCACCG CGCCGTACAG AACCGCCAGA GTCTCCACTA CCGACATGCTGCCGCCATAG TGACCAAAAG CCAAAGATGG TTTAAGGATT TGACGGTGGA CCGAATATCGACAGTTGGGT GATTTCGGTT ACGTTCATTC TTCCTGAA 155 333 GATCGTGGTC CAGCTTATGAACGGTATAAC TGAGGGCGGA CGGCGTTTTA AATAATTTTG CCGACGCCGC CGCGAACGTGCCTTCTTTTT CTAACGCATC AAGAATAATC AGAACGTCCA GCAGTGGTTT CATACTCGTCCCCTTGCCGC TATATGGCGA CCACCTGCTG GACAGCGACT CACTCCATCG GCATCACCAACGGATCGGGA TATTGATATT CAAATCCCAG CTCATTACAA ATCGGCTACC GTCGATAATCTTCCCTTTTG CCGTTGTCGG TGGTACGAAA ATCGCGGCGG CGATTCCCAG CAAGCGTATTGCGATAAACA CTG 156 334 GATCCACCCA CGTCATCAGT TGTTCAAAAC CCTGCTTCACGGTGTGTTCC CATGGACCGA CCATGTGGAA AGCGGCTATC TTGCGTTTTT GTGGCTGCCTGATTTCGTAA TCCATGCTGC CTCCGTCACT TCACAATGCT GTATGAATGT ACAGTATAATTACAGCCTTT TACGGTCACA AGGACAGCGT GATCATTTTG TGAGCAACCT CGCAATCCCGCCCTTTTGAC ACCTCAGATG ACGGTGAACG GTGTGTGTGA CAACGGCTTA CGCTTTATGTGAAAATAGTC GTCAGACGAG AGAACATACC GCCTTTACCA CGATTCAGAG TGAC 157 152CGTTTGCTAT CGACCTGCAG ATCGGAACGG ATTGGCGTCA CGTGATGGAT AAGACCGTGTTCTTCAATGT TATCTCGGCG ACACGAGCGC ATCCGGCGAA ATATCGACCG CATCAACCTCTGCGTCGGGA AAGCATAACA CAGGCATGGC AT 158 204 GATCGAACGC GCGTTGCAGCAGCGCCCGGC TATTTTCTAC CCGTGTCGTA TCGCCGAAGT TGTGCCATAA CCCCAGCGAAATAGCGGGAA GTTTGACGCC GCTGCGTCCG CAGCACGATA CTCCATTGTG TGATAACGATTCTCATCGGG CTGATAAATC ATGACCTTTC CCCTGTGGCG AGAATAATAT GTGTACGGTT ACTC159 283 GATCTTACCG AGTGGGAAAC TAATCCGCAA TCGACCCGCT ATCTGACGTTTCTCAAAGGT CGGGTAGGGC GCAAGGTCCG CTGACTTCTT TATGGATTTC CTCGGCGCCACGGAAGGGTT GAACGCCAAA GCGCAGAATC GCGGCCTGTT GCAGGCAGTG GATGATTTCACCGCAGAAGC GCAGTTGGAT AAAGCGGAAC GTCAGAACGT GCGCCACGAG GTGTACAGCTACTGCAATGA GCAATTACAG AGGGAGAATG AGCTGGATCG CTGTCTAAGA GCT 160 302GATCGCGTTC GCCAGGCAAA ATATTACCGT GCTCAAGAAT ACCGCTGCGC ACGGCATCCTTTACCGTCTG GGCGAATTTC ATGTATAGCG GCGTATTATC CGCCGCTGAA ATTCGTTCATTCAGTTGCGC 
GATGAGCCGG GTATGCGCTT GTTCCATTTA TCTTTCCTGA CGACGGGTCTGTAGGCAGTA TACTACCACC ACGCGTGGAA ATGATGTACC GGACCAATGC CCTTCCCCACTTCCAGCCGT GTACGCTGGC AGCGCCGAAG CATGCCTTGC TCGTTTACCG TCTCTCCCAA CT 161233 GATCCTGAAT GAAAATCTCA CTGCTCGGCT TGTTGGTCAG TTCGGCCATG GTCTGGCGCACGTGCTCCAG CATGCCGCCG ATATTGGTCC CGGCCTCGCC GTGACGTTGT CGAGCTTGCCGCAACCGTCC ACCGCTTTGC TGATGGCTTC GGACGCCGGC GGCAACATCC ACACAGCGCACCGAGACCCT GAGCCTGACG CTACCGGATC CGGCGGTATG AGCGGTTAGC GAG 162 236GATCTGTTCC GTCTGACGGC GGGTAAACTG ACCGGCCTGG ACCGAATGGG GCCAAAGTCCGCGCAAAATG TTGTTAACGC GCTGGAAAAA TCCAAAACGA CGACCTTTGC GCGTTTTCTCTATGCGCTGG GCATCCGTGA AGTGGGTGAA GTGACGGCGG CGGGGCTGGC GGCTTATTTCGGTACGCTGG AGGCGCTGCA GGCCTCCGAC CATTGACGAG TTCGAGAAGT ACTACT 163 334GATCGCGTGT CGGTGCGTGA TTTAAGCCGT GGCTTAATCG TGGATTCCGG TAACGATGCCTGTGTGGCGC TGGCGGATTA TATCGCGGGC GGGCAGCCGC AGTTTGTGGC GATGATGAACAGCTATGTGA AAAAACTCAA TTTACAGGAT ACCCATTTTG AAACCGTCCA CGGTCTTGGATGCGCCGGGA CAACATAGCT CCGCGTATGA CCTGGCGTAC TCTACGGCGA TTATTCACCGGCCGAAGCCT TGAATTTATC ACATGTACAC GAGAAAAGCC TTGACCTTGA ACCGATTAGAGCAGAACCGA ACGCTTGATG GATAGACACG AATG 164 308 GATCGTAGTG GAGAGTGTCGCCGAACGTCT GGTGCAGCAA ATGCAAACCT TCGGCGCGCT GCTGTTAAGC CCTGCCGATACCGACAAACT CCGCGCCGTC TGCCTGCCTG AAGGCCAGGC GAATAAAAAA CTGGTCGGCAAGAGCCCATC GGCCATGCTG GAAGCCGCCG GGATCGTCTG TCCCTGCAAA AGCGCCGCGTCTGCTGATTG CGCTGGTTAA CGTCTGACGA TCCGTGGGTA CCAGCGAACA GTTGATTGCCGATGCTGCCA GTGTAAAGTC AGCGATTCGA TAGTGTGTGG CGCCTGAG 165 362 GATCCCATCGCGAATATCGG TAAAACAGCG CTTCTGCTGA CCGCCGTCGA TAAGCTTGAT CGGCGTTCCTTCTACCAGGT TCAGAATCAA CTGCGTTATC GCGCGTGAAC TGCCGATACG CGCCGCGTTCAGGCTATCCA GCCGCGGCCC CATCCAGTTA AAGGGACGGA AAAGCGTGAA GCCAATCCCTCTTTTTGCCA TAAGCCCAAA TCACCCGTCG AGAAGCTGTT TGGAAACGGA GTAAATCAGGGCTTATTCAC CGGCCCGACG ATCAGATTGA TTGTGTTGTA AAGAGGCTCT AATCGGTCACATTAGAGAGA GGAAACATTT AGTATTAGAT AAGATACCGA GTTTAATAGT AA 166 71ATCGCGTTGT GTTGCCGAGC ATTTATTACA AGGCGCTTCT GTGTGNCNCT CGAATGGTGCNGCAAGACTG C 167 363 GATCGTGTCG CAATTCTTAA TGCCATAGAG GGTAATCATATTGAATCCTT TAACGCGAAA TTCGAATAAA 
TAATCAATAG TATCGTCTGC GGGATAATAAGTGTGGCCGT TTATGGTTAT TTATCCAGCG CTGATCGGCA ATCAATATAA CATTGTTGAGTGAATGTGAA TAATGATTCC TTTTCGTTCC AGATGTGGCT TGTTTATACT TCGCCGGTATAATCCTATTT GGGCAAATGC AATTGTGTTT ACCATTGATA AGGTAGGTAG GAAAGGTATATGTGCTAATA TGGCGTAGTC ACATAATTAG TCTACGGCCA TGATCAGACG CAACAGGATCGACTCGTATG ACTTTACGAC CGC 168 329 GATCCGGCGC TGATTTTCAC CATCACGTTTTTCATCGGCT GACCTGCGGC GTCTTTCACG TCGATGGTGG CGGCCATCTG CTCGCCCTTCTTCGCCTTTG CGCTTCCGGT GGTTTCATCC TGGCCTGCCA GCGTCAGCTC AGGCTGGCGGCGGCGCTGCG GGCGAGGCAA GACAGGTCTG CATGTAGTAC ATCGAGGTGC TGGTCGTCGTTTGACATCAT TGCCGTCGTT AAACAGGTTG ACCGCCGCAT AGAGCGACTT GTGCCGTCTGACGATATCAC GTAATCCCGC CACAGTAGCG CTGAGCTGTG TGCTGACTGT ATGCACTAG 169 198GATCTGGCGG GCGCGTGAAA ATATGTTGCT GGCCTCCTGT ATGGCGGGAA TGGCCTTTTCCAGCGCCGGT CTGGGGCTGT GTCATGCGAT GGCACACCAG CCTGGGGGGC GCTGCATATTCCGACGGCCA GGCCAACCGA TCGTCGTCGC AACAGTCATG GGCTTTAACG GATCAGTTTACGGAAAGTTC AGTAATAT 170 273 GATCAACATC AATAACTAAA ACTCTTTTAC CAAGATAGTTAGCCATGAAC TCAGCAATGC CAACACATAG AGTTGTTTTT CCTACCCCGC CTTTCATATTAATAAAGCTA ATTACCGATG CTGGCATAAT TATTCCTTGC TATGTTGAGA ATGAGTCATTTTGATAATTA CTCGAGCTTT TATCTTAATC TTCGCGCGTT CGAATCCTTC CCTTCATGTACTTCTCGTAC ATGGCATCCA GTTCCTTGAG ACGAGATAAT ACCCGAAGAA AAT 171 244GATCGCTGGT TCTGGCGGCA CCCTGGCGCC AACCCAAGCA ACGTCGCGCG CGCGGCATGGCAGGATCTTA CCGCCGGGCG CGTTATTATT TCCGGCGGCA GTACGCTGAC TATGCAGGTGGCGAGACTGC TGGACCCCGC ATTCGCGCAC GTTCGGCGGT AAAATCCGCC AGCTTTGGAGCCCTCCAGCT TGAATGGCAT TTGTCCAAGC GCGATATCCT GACGCGTGTA CTGAACCGAG AGTG172 247 GATCGCGCAG CGCTCTCATA GCACAAAACG AGGTTTTCCA TTCTGTTATGTTCCCTGGCG ACGATAAACG TTCGATTGTC TCATGGCGCT GGTGAACCTT ATTTTTTAACGGAGATGTTG AATGGCGGTA GAGGTTGTAC GTAATGGCCA AACCCGGCGG CGGATCTCGAATATTGATTC GGCAATATTC GTTCTATCTT GGAAAAGGAG CGCTGTACCG GAACGGAATAAAACTGCGAT GTGCAGA 173 300 GATCAGCTTG CCGCACTGTA TGCCTCCAGC GACGGCAATAAAATCCACAC CGTATCCGGC TGGCCGACTG AGTATGACTA CTGGTCATCC ACCTTCGCCAGCGCCGCTAC ATGGCAGGCG GTATCACTGG CTGCGGGCGG CTATACCGCT TCCGGCGATGCGGTCGGACT ACGTGAGCTG TCTGGTCAGC AAAAATCGAC GCGCGTCTAT 
CACCATTGAGCCGGTGGATG CGCATTGTGT ATACGCAACA GCGAACACGC GTGAAGGTGA AAGGCATACGTCAGCTTAAG TGACGTAAGA 174 337 GATCCGGACC GTGCCTTATA CCCTGAAAAAGGGGGAGACG GTGGCGCAGG CGCACGGCCT GACCGTCCCA CAGCTGAAAA AACTGAACGGGCTCCGCACT TTCGCCCGCG GCTTTGACCA CCTGCAGGCC GGCGACGAGC TTGACGTTGCCGGCGGTCCC GCTGACCGGC GGGAAAGGTG ACAATAACCG CCATGACGTC CGCGGTCCGTTTGCTGCTGA CCGGGAAAAT GAGGACGATC GCAGGCAGCA GATGGCCGGC ATGGCTCACAGGCGGCAGCT TCTGCCAGCC ATCGGACGTT AGGCCGCCGC GGATGGTTCG TATTCGCGTTGACATGT 175 424 GATCAATGAA GCTTTGTGGG AAGTCTTGAC TTTCGTCGAT AAATACGTAATCAAGTGCCT TTTTATCAGC TCTCCCACTA TTATTTATAT CTGCAATGGC TTTCTTACATAGGGCATCAA AATCGCCATT ACCAAATCCC CCAAATGGAA TTTCGCTAAT AATGGCATATATATCTGGTA CATTCCAGAA AAAGGTTCTT TACGTCAAAC CCCAAGAGTT GAAGCAAAAAAGTTTTTGTA CCCCATTCTA TCTGTTTTTC GACTCGCATA AATCGAAAAA CTCAGGGATTCTGGTTCTCA TTGTGGAGCA GATTATAAGC AGTAATGCAT CTAGATACGG TTTGATACTCTCTAGTGTAG TATCAGTTAC TGACAGCTAC TGCATAACCC TTTCAGCACT GAGACACGTGCGCAAATGTG TAAA 176 190 GATCATTTGA TTAAAACCTC ACACCGCAAG ATGCGACTTTTTGTAAACCT GCTTTACCGC TGACACATTT CTCCGCATTA CTGCGGAACA AGGCTTAAAAAGCGTATCCG AACGTATAAC CCTCCAACGT TCGCTACGGG AAAAATGGGG ATGAGTACTGGAAGGTCGCA TATATGACCA AGCCAGACAT 177 441 GATCCATGCC TGTGATGCCTGGATGTCCCG AATACTTGAA GGTTTGATCG AACGGCAGGC CAGTAATGGC AACGCCACTATTCTGTTATC TGCGACGCTA TCGCAGCAGC AGCGAGATAA GCTGGTGGCG GCATTTTCCCGTGGGGTGAG GCGTAGTGTG CAGGCGCGTT GCTAGGCATG ACGATTATCC CTGGCTGACTCAGGTCACAC AAACAGAGCT GATTTCTCAG CGGGTTGATA CACGCAAAGA GGTTGAGCGTTCGGTAGATA TTGGCTGGCT ACATAGTGAA GAGGCGTGTC TGAACGTATA GTGAGCAGTGAAAGAACTGT ATCGCTGATA CGTACTCGTG ATGATCGATC GATCTACCGA GCTACTCACTGGTAGGGCAG AACTTACTCA AGGCTCTCAG GCGTCTAACA GGCGTCTAAC ACGTGGAAGT T 178370 GATCGTCGTT ACCGGCGACG GTTAAAGCAA ACTGGGCATC AATGGGCCGT AAGAGTTTTTGTTCAACGGC CTCCAGCAAC CGCTCCTGGA TTGTCATTGC GCCTCCTCAC TCATTTCACCTGCAAACATA TCATCCAGTT GGTTAATTAA CGCCGCCGCA GGACGAGTGG TAAAAATACCCTGCTGCGGA CTGTCGCCAT CCACCCCGCG TAAAAAGAGA TAGATGACTG CCGCCGAAATGGCGTTCATA GTCGTAATTC GTCATTCGAT GACGAAGGTA ACGGTGCAAT GCCAGCGTATAAAGCTGGTA CTGCAAATAT 
AGCGATCGCG TGCTCCGCGC AGCCATGCGT CTGGATAGCGCTATCTGCCG 179 212 GATCCGGGTA CTATGAGCCC AATCCAACAC GGGGAAGTGTTCGTTACTGA AGACGGCGCT GAAACCGACC TGGACCTGGG GCACTACGAG CGTTTCATCCGACCAAGATG TCTCGCCGCA ACAACTTCAC GACTGGCCGC ATCTACTCGA CGTTTCTGCGTAAAGAACGG TGACTATCTG GGACGACAGT ATCTAATATA CGGATTAAGA GG 180 367GATCTTCTTC ACGTCTGGCT TCATCACTCT GATGAACGAT ATGCTCGGTC AGATGACCTTTAATCACCTC GCGCATTAAG CCATTTACCG CGCCGCGAAT CGCCGCGATC TGTTGTAACACGGCCGCGCA TTCATGCGGT TCATCCAGCA TTTTTTTTAG CCGCTATCAC CTGTCCCTGAATCTTGCTGG TTCTGGCTTT AAGCTTTTGT TTGTCCCGGA TGGTATGTGA CATTACAACACCTCACTAAA CATTAACGAA TACAAATTAT AGCATTACCA GATGCTACTG GGGGGTAGTATCTATACTGG GGGGAGTAGA ATCGACGCCC ACATAAAACA ACTAAGAATC ACTCATGGGTGAATTTC 181 196 GTATCACGTT TGATGCGGCT GTTATCGTCC AGATAGCCGG TGCGATAGGCAAAATAATGC GGCAATGAAA GCGCCAATCG CCAGGGGGGA TCCCCACAAT ATATGCCAGCACGACCCCGG GGAATACCGC ATGACTCATT GCATCGCATT CGCGCTTTTA CACTAAAACCCGCGTAGGAG ATCGCAATCG GACTAG 182 266 GATCTGTCGC GTTTTCGCCA GAATAGCGCGCGGAATAGAT ACCCGGCGCG CCGCCTAAAA CGTCAACGGC CAGACCGGAG TCATCGGCAATGGCGGGCAG GCCGGTCATT TTGGCGGCAT GGCGCGCTTT GAGAATCGCG TTTTCAATAAACGTCAGGCC GGTTTCTTCC GCGGAATCGA CGCCCAGTTC CGTTTGCGCT ACCACATCAAGCCAAAATCG CTTAACAGCG AGCNNCACTT ACGCGTNTGC GAGACACTTT NCTGAG 183 351GATCATCATC ATTCCGCAGC CAAACGCGCG GCTTTTACCG AACCCCTGCG CCAGACGTTGCAGGAAAAGC GCGGGTTCGT TAATCACCAG CACGCCGGTA TAGTCCACGC TGCTAAACTGAATCATCTGG CCGATCTTTT CCCGCGACGT ATCTGCCTGC CTGCCGATAA GCATCAACGCTCGGCTCGGC AGAGTAAAGC CATTTTGCCT CCCCCTGCGC GCCAACCACG CAGGCGCTGCTGCTGATAAG ACCAAATATG CTGGCTATCA CCTGCGTTTA GTGGCGATTT AGACTCATCAGCAAATCGTG AGTTGCGTTT TGCAACGAGA TTGGGAGGTT AACGAGATGA A 184 398GATCATGTGG TGATCTGCGC CGGACAGGAA CCTCGCCGCG AGCTGGCGGA CCCGTTACGCGCCGCAGGTA AAACGGTACA TCTTATCGGC GGATGCGATG TCGCGATGGA GCTGGATGCCCGACGGCGAT TGCCAGGGCA CCCGACTGGC ACTGGAGATT TAACGACTTT GCCTGATGGCGCTACGCTTA TCGGGCTTAC GCCGTCATAC CGGTTTTATA GGCCGGTATG ACGCTTGAGCGCTTATCGAC GGCGTCCTGC TTCACCGCTT TCAAAATGAC AAATTTATTG TTGGTGCTATCGTCGCGCAA TTACCGAAAT CTTCTTCAGC TGTGGAAATA GTCAGATGGC 
GTTCGCACATATACAGTTGC CGTGATTAGC ACACGCTATG CAATTCAG 185 347 GATCGCTATT GGTATGGCCCCACTTGCCGT ATTTCACCGG AAGCGCCGGT GCCCGTGGTT AAGGTAAATA CCGTTGAGGAACGCCCGGGC GGCGCGGCGA ACGTGGCGAT GAACATTGCG TGCTCTGGGA GCGAACGCCGTCTGGTCGGC CTGACGGGTT ATTGATGACG CCGCGGCGCC TGAGCAAAAC GCTGGCGGAGGTCAATGTGA AGTGCCGACT TCGTTTCTGT GCCGACGCAT CCGACGATTA CCAAACTGCGAGTACTATCT ACGTAATCAG CAGCTCATTC GTTTGATTTG AAGAAGGCTT TGAGGATGACCGCAAGCCGT TGCATGAGCT ATAACCA 186 294 GATCGGCGTG CTGGCGGCGA CCTGGCCGCGGGAAATACCC TGGAAGAGGC GTGTTATTTC GCCAATGCGG CGGCGGGCGT AGTGGTAGGTAAACTCGGGA CGTCAACGGT TTCCCCTATT GAGCTGGAAA ACGCAGTGCG CGGACGGATACCGGCTTCGG CGTTATGACC GAAGAGGAGT TGAGACAGGC CGTCGCCAGC GCGTAAGTCGCGAGAAGTGT CATGACCAAC GCGTTCGATA TCTGACGGCA TTATGACGCA ACTGGACCTATCGGATACTT ACTAGACTAC ATAC 187 352 GATCCGCATT GTCAGGGATA TCGCCCTGAACGCGAGCTAC GCCGGCATCT GCTGCTGATT ATTGCCATTG ATCACCGCCA GCTTAACGGCCCGTCGCCCT GGAGCTGTAC CGTAATGTCA CCAGCAAACT TCAGCGTCGC GTCAGTAGGCTAGTGGCGAC CAGCAGTTCG GCAGTACGTT TTCACCGGCT GCGGATAGTT ATGATTGTCGAGGATCTGTT GCAAGGTTTC CGAAACAGTT ACCAGCTCGC CGCGAACACA AAGTTTTCAAACAGATAACG ATGTAATTGG TCATGTTGCG CATAATCATC TCTCTTCAGT ACATTATTCACTATACGTGT TTAAATCGTA CA 188 290 GATCCTTACC GTTTTGGTCC ATTAATACAGGAAATGGATG CCTGGCTATT GACGGAAGGC ACCCACCTGC GTCCTTATGA AACGCTGGGCGCGCACGCCG ATACGATGGA TGGCGTCACC GGCACCCGTT TCTCCGTCTG GGCGCCTAATGCTCGTCGCG TTTCGGTTGT CGGGCAATTC AACTATTGGG ATGCGCCGTC GCACCCGTATGCGTCTGCGC AAAGAGAGCG TATTTGGGAG CTGTTATCCC GGCATAATGG ACACTGATAATCGAGCTCGT ATCGCAAGAA 189 213 GATCTTCAGC AACCACGACA GGAATGCCCGTCTCTTCCAT TAACAGACGG TCAAGGTTAC GCAGCAGGCG CCGCCCCGGT GAGCACCATACCGCGCTCGG AGATGTCTGA CGCAGCTCCG GCGGACACTG TTCCGGCGCA CCATTACCGCGCTGACGATA CCGGTCAACG GTTCCTTGCA ACGTTCCAGA ATCTCGTTTG CGTTCAGGGT AAA190 256 GATCGCTTTG GTTAAATCCC CGCCGCCAGT GTCGGCGCGA CCAGAGCGGAACGTGACGAT TCTGTCGGGA AGCTGCAAGC CAGTGCTGCG GCGGCCATGA GGACTTCCTGCAACAGTAGA CGCGCCAGTG CGGCGGCAAT TTCGCTGCGG CGGGTAAATT TAAGCTGATGCACCAGTAAA CTCAAGGCGG TGTATAGTCA CTGACGCTCA CCAGACTTGC AGGGTGGCGGTTTTTTCAGG CAGCGACCGC ATGGGG 
191 247 GATCGTGGCT GCCGGTGCTG TCGGTGTAGCCACCACATTG ACGGCGGTCT TGGGATACTC TTTCAGCACC ATCGCCACGG CGGTCAGCGTCTTAGCGCCT GCCGGCTTTC AGCGTCGGCT GCTGCTGTCG AAGGTGACAT TATTCGGCATATTAGAATGA CTACTTACTC GCCCGCCTTC GGCTCACGCT AACGCCTGTG CCCCGATTTGTAGAGTTTGC TTCTGTACGT AGAGTAACCA GCGCGCA 192 402 GATCCATTTT AACTTTAGCGGCCCTTTTGG CGAGGAGATG ACTCAGCAAC TGGTCGGGCT GGCGGAGTCT ATCAATGAGGAGCCGGGCTT CATCTGGAAA ATCTGGACAG AAAGCGAGAA AAACCAGCAA GCTGGCGGTATTTACTTGTT TGAATCCGAA GAAACGGCGC AGGCTTATAT TAAAAAACAC ACTGCGCGTCTTCGAAAAAT CTTGGCGTTG ATGAGGTGAC GTTTACATTA TTTGGCGTGA ACGACGCGCTGACGAAAATA AATCACGGCA ACCTTTGCCG CTAAATCACA TAACGCAGGT TCTGTTCCGGTGCTGCTGAC CGCAACGGTA ATCTTTATAC CGGGCGAGTA CCTAAGAGGC TTTATGGACGACAGCGACAC GACGTTTCAG CG 193 240 GATCGCGAAG CCGCACAACG TAAGCAGGGGTTATGTAGTG TGTTCTTCAA CACCACGCTA TTCATGCCGT ACCGCAGGTA GATGTCCCCCTTAGGAGCAT CGCTTACGCT GGGAACAGCG TTTAAGCAGC TTTTTGACAA GGGAGCTTTGATGTATTGTT TGCAGTTCTA GACCTGACAC GGGCGATGAA TAGGAGCAAA GCGTGGTTTACACATCCATA TTGCTATGTT ACACTATTAC 194 248 GATCCCCTCT ATACCGCAGACAACACAAGG CGCGCTTGCT AACGCGGTGT TACAGGGCGA AATCTTTCTA CAGCGCGAGGGACATATCCA GCAACGGATG GGCGGGATGA ATGCGCGCTC GAAAGTCGCA GGAATGTTAATGCGCCAGGA TAACGCCCTC CGCTAAATTC TTGGTATTTT ATTTGGCTGG CCGACGTCGCAAATTAGCCA AAGTTAGCCA ACTTCTAGCT GATTCATCTA CGATAATT 195 304 GATCGGGGTTCAGCTCAAAT TTTTCAATCG CCCAGGCAAC ACCATCTTCA AGGTTCGATT TAGTCACAAAGTTAGCCACC TCTTTGACCG ACGGAATGGC GTTGTCCATT GCCACGCCCA TACCGGCGTATTCGATCATC GCAATGTCGT TTTCCTTGAT CGCCATCACC TCCTCTGCTT AATACCCAGCGCCTCGACCA GTGATTTACG CCAGTGCCTT TATTAACCGT TATCGAGGAT TCAAGGAAATACGACACTTA CGCACGGTAC TTCTCATTGC GAACGCATGC GCGAACGCAG TCAT 196 301GATCTGCGCC CCAGCGTTTG CAGCAGAAAA TAAAAGCCGA AAATCACCAC TAAACAGGCGATCAACACGT AGAGAAGCAA CCTCCCAATC AATTTCATGG TCTTCCATCC CGTGAAATGCACATAGGGGA TTTATGCACG ATTTGCGTGC AATCCTCAAG ACAGGAATGG TGAAAGAGCGTTACAGCAGC GGCGAATCGT GTCGCGCGCA GGGTTTTTAC GGTTTTTCGG CGGAGAATCAGTCAGCACGA TAGCGTGATG CGCAGCGATC GATGAGAGCG ATTTACCATC GGACTGAGAT T 197366 GATCCAATCC TGAACGCCGA ATTTTCACCA CAGGGCGTTG CGCTACGCCA 
GTTCACTACCCGCTGGGAAG GCGGTATGGT CAGAACTTCC GGCGCCTGGT TACGCGAAGG CAAAGCGCTTATTCTGGACG ATACCGCTAT CGCCGGGCTG GAGTATACGC TGCCGGAAAA CTGGAAGCAGTTATGGATGA AGCCGCTGCC CGACTGGTTG AACAGCTGAC GCTGAAAAAT TCAGGCAGCGCAATCTGGTG ATTGATATCG ACCCGGCCTT CCGTGCAAAT CACCGCTCTG ACGCTACGCGCAAACTGAGC TGTACAACCA TCATCAATGG GCTCTGAGCG CATCGACTAC GGCAGCGGAA CTTTAC198 310 GATCGCTACC CAATTCCGCG CCCACACAGC CTGCTTTAAT CCATTGCGCTAGGTTTTCCG GCGTCACGCG CCGACGCAAA TAGCGGAACA TCCGGCGGAA GTACCGCTTTCAGCGCGCTG ATGTAGCCCG GACCAAACGC CGACGACGGG AAAATTTTTA ACTTCTGTGCTCTCTGCATC CAGCGCAGAA AAGGCTTCCG TTGCCGTCGC GCAGCCGACA CACGTCATGCCATAGCTCAC CGCCGCGAAT CACTCGGTTG ATATCGCGTA CATCACTTCG CCATCGCACGTGTTCTTCGT TAGCTGTACA 199 348 TCGAAAATAC GTATACCCTG ACAGTGAAAGCAACCGATGT TGCAGGCAAC ACGGCGACGG AAACGCTCAA TTTTATCATT GATACCACATTGTGGACACC GACCATCACG CTGGATAGCG CAGATGATAG CGGCACCGCC AACGATAATAAGACTAACGT TAAAACGCCC GGGTTTTATT ATCGGCGGTA TTGATTGATT CTGACGTGACTCAGGTCGTC GTGCAGGTGA TGCGCGATGG TCACAGCGAG GAGGTGGAGC TGACCGAGACTAACGGGCAG TGGCGTTTGT ACCGGCACGC GTGGACTGAT AGGCGACTAT CGCGTACGTAGTGAAGATAG CGTATATA 200 279 GATCGGATAA CGACTCCGCG GTGGATGCGC AAATGTTGCTTGGCCTGATT TACGCCAACG GTGGGCATTG CCGCCGATGA TGAAAAAGCC GCCTGGTATTTCAAACGCAG TTCCGCCATT TCCGTACCGG CTATCAGAAT ACTGCGGGAA TGATGTTTTAAACGGTGGAA CCGGGCTTTA TTGAAAAGAA TAAGCAGAAG GTGTTGCACT GGTTGGATCTAGCTGTCTGG AGGTTTGATA CCGATACCGT TGCAAGATTC GAACGCTACG ATGCTATTT 201 272GATCGCCAGG GACGATGGCG AGCTGGGCCC CTTGTAAATC GTTTTTGGTG AGGCCGAGATGAAAAACATC AGACTTGGAC ATATAAAACT CCTCTGTGAA TCGGGTTTGT CAGAAGAAGAAAGAGACACT TTACCTAAGG ATAAAGATAT TTTGGTGCAT CATCACTATG CGTAAAACAATTGCGTGTTC CATTAAAAAG AGATGCCCCA TCACAATAAA TAATCAATAT GCAGGCATTGCACAAAGCAT AGGCGTTTAG GCATGTGTTG TA 202 401 GATCCAATAA TGACTGCATTGCCTCATACC CCATACGTAA CGCGCTATAC AAAATATAGA TGCCGATACC TAACGCAAACAGGGCATCCG CACGATGCCA ACCGTACCAG GATAACCCCA GCGCGATAAG AATCGCTCCGTTCATCATAA CATCAGACTG ATAATGAAGC ATATCGCCCG TACCGCCTGA CTTTGGGTCTTGCGTACCAC CCAGCGCTGA AACGTGACCA GTATAATAGT GCATATCAGA GCATGACGGTAACGCCAATC CCACGCGGGG 
TCGTTCATTG GCGTGGCTTT AATCAGATTC TGAATACTGGTCAAAAACAG AAACACGCGA ACCGGAAATA ACTACTTTGC GCGCGCAGGC ACTCGTTTACGTGCCAAGGG TTAATGGTGG G 203 169 GATCCAAAGT CGTTAAATAA CGGCGGGAAAAGCCTCCACG CCATGGAAGT GCCCCGGAAA TCGCCCCGAC CATGGTGGCG ACAGTATCAGTATCATTGCC GATATTAACC GCCGGAGATA ATAGCATCTA CGGCAGAATT CGGACAACACGCGAACAGGC CAAAGCGGC 204 253 GATCCAAAGT CGTTAAATAA TCGGCGGGAA AAGCCTCCACGCCATGGAAT GCGCCGGAAA TCACCCCGAC CATGGTGGCG ACAGTATCAG TATCATTGCCGATATTAACG CCGGAGATAA TAGCATCTAC GGCAGAATTC GGACAACACG CGAACAGGCCAAAGGGCCGG CACCGCTTCA CTCACGTGCA GCCGGAGCAA TATATAGCAG TTCACACGCGTTCCATGGAT GAGCTTCGAT ATAGCTCAGT ATG 205 198 GATCGTACAG ACCCGCGTTGTCATAACCAC GGGTTTTTAG TTCCGCCACA CGCTCGCCCG CCAGCGTTTT CATATCCTCTTTCGAGCCAA AATGAATGGC GCCGGTTGGA CAGGTCTTCA CGTCAGGCCG GTTCTTGCCGACGTGTCACG CGGTCAACGC ACAGCGTACA TTATGACGTC GTTGTCTTCC GGTTGAGG 206 411GATCGGAATG CCTTTGAACA GCGGCAGGTC TTCCAGCGGC AGTCCGCCGG TCACGGTCACTTTAAAGCCC ATATCGGACA GCCGCTTAAT CGCGGTAATA TCCGCCTCGC CCCACGCCACGGCTGCCGCC TGGGGTCACG GCTGCGGTGA TAAACCACTT GCTGAATACC CGCATCACGCCACTGCTGCG CCTGTTCCCA GGTCCAGTAA CCGGTCAGTT CGATCTGCAC GTCGCCGTTGAACTCTTTCG CCACATCCAG GGCTTTTGCG GTGTTGATAT CGCACAGCAA ATCACGGTACCAGTACGGTT GGCTTCGAAA CACATACGGG AGAGGATTTA CGAATGCATT GGGAGAGATTGGGTAGGTCA GTAGACGAGA ATGCAGAGAT GGCATGAAGA TTGAAGGGTA G 207 402GATCCTGAGC CGGGTAGCCA GTATTTGCAG GCAGCAGAGG CAGGTGACAG ACGCGCACAATATTTTCTGG CCGACAGTTG GTTGAGCTAT GGCGATTTGA ACAAAGCTGA ATACTGGGCGCAAAAAGCCG CCGACAGTGG CGACGCCGAC GCCCTGCGCG CTACTGGCCG AAATCAAAATCACTAATCCG GTAAGCCTGG ATTATCCCGA CGCGAAAAAG CTGGCTGAAA AGGCGGCTAACGCGGCAGTA AAGCGGGAGA AATTACGTGG CGCGGATCCT GGTCAACACC CAGGCCGGGCCGGACTACCA AAGCCATCTC GCTGCTGCAA AAGGCCTCTG AAGATCTGGA TACGACTCGCGTGATCGCAA TGTGCTTGCT ATTGACTGGG CATCTCGTTA AA 208 288 GATCAAACGCGCTGGCGTAA TCGCTACTGG GTTGATAGCG AAGGCCAAAT TCGCCAGACG GAACAGTATCTGGGCGCGAA TTACTTTCCG GTGAAAACCA CGATGATTAA GGCGGCAAAA TCATGATGAAAAGGACGATA AGCGCGCTGG CGTGGCCTTT GTCGCGTCAT CCGCCTTTGC CAGCGGCACTGTTACCGTTT TTACCCAGGG TAATAGCGAG CTAAAACGCT GACAGACGCT 
GAGCGCTCGCTCGATTAGTG GACAGCGCGC TGCACGAGCT GGTGGCTG 209 169 GATCAGGGAA CCTGTACCTCTTAAAGAGAA GTTCGATACC CCCAACGGTC TGGCGCAGTT CTTCACCTGC GACTGGGTAGCGCCTATCGA TAAACTCACC GAAGAGTACC CGATGGTACT GTCGACGGTC CGAGTCGCCACTACTCTCCG TCAATGACCG GTAACTGTC 210 311 GATCATCTTC GTCCTGCTCT TCCTGACTCAGCGCACTGTT TACGACAATA CTGTCCGCAT CTCGTTGTGC GATTTTATCG GCGACGTCGCGGGAATAATC GCATATTCAC ATTCACCGCT GTTATTGATA ACCAGACGGC AATCGCAGACGCCCATTAAT CAGTTGCGTC TGAGTGAGCT TATCCACGTC TATTTTTTTG ATGACGTTATTATCGGTGAA GTTAAAACCA ATATCGCCTT TAGATACATT GATTCTATTC ATTTCAATAAGTTGCTTAAC CTGAGCTTTA AACTCTTCGC TAAAACCGCT G 211 368 GATCAGTATCATCAGTAATG GCCAGCGTTG CAGTATTCTG AATAGCCAGT GAGGTTTTCA GCGGGAAAATGGCGAGGGTA TACGGAACCG GTTCGGTGGT GCCTTTTGTA GCAACGGTAA ACATTTCCATATTGCCGTTT TTGATAATCC GGTGGAAGAC TTCTGCCAGA CTGGGCTATC AACGGTTCCTGAGATAGCGT CAGATTTTAC ACCATCAGCG GTAACGTCGC GTATCGGTAT AAATAGAGAACGCGCCGATT TTTACACCTT CGGTTGTTTG CCAACGCGAG ACATTGTGGA TCAGATACTATACTATAGTC ATATCGCATG GCTATGAGAT ACGAGTGCCT GGTGGTGTGC ACGTATGA 212 258GATCATCCAC TCATCTTTGC CGGTTGAGCC CGATAGTTAC CCGTTCAATA CCGGCATCAATCGCCCCCGT TTTATTCACC ACCCCCAGAA AGCCGCCGAT AATCAAGACA AACAGCCGCGACGTCAATGG CGCCGGCGGT GTAGGTTTCT GGGTTATAGA GGCCGTCAAT CGGCGCCAGCAAAACAGCGG TAATCCTTTC CGGATGCGCA CGGGGCATAC GCTCCGCACC GACTTTCAGAGCTGCTATCG ATTGATTT 213 322 GATCATTGTC ACGCCATTTT TTTAAATTAT TAGTATGGCGTGTGGAGACG CGTATCTGCT CACCAATATA CGTATTGTCC ATAGGCGTAG ACAAGCTCCATTGCTACAAA GATAATTTTA TTTAAGTGTC AGGAAAATTC CGGACAAATC CCTTTTTTAATAAAAATACA CACTCTCGGC ATGGGATAAT ACTTAATTAA CTTTTGTTAG CGTTTTGAAATTAAAAACAG CGCAGAGGTA ATAATAGAAA ATAACGTTAA CAGGCTGGGT GAGTATATTTGACTGACACA ATTCCAGGTG TATATGTATG CGTTTATGCA TG 214 320 GATCATCCGCAGAAGAAAAA ATATGGCCGC GTAGAGATGG TGGGGCCGTT CTCCGTTCGC GACGGAGAGGATAATTACCA GCTTTACTTG ATTCGACCGG CCAGCAGTTC GCAATCCGAT TTTATTAATCTGCTGTTTGA CCGCCCGCTT CTGTTGCTCA TTGTCACGAT GCTGGTCAGT TCGGCGCTCTTGCCTATGGC TGGCATGGAG TCTGGCGAAA CCGGCGCGTA AGTTGAAAAA CGGGCTGATGAAGTGGCGCA AGGCAACCTG CGTCAGATCC GGAGTGGAGG GGAGAGTTCT GGTGCAGTTTAACAGATCTA 215 
277 GATCAGATGG ACCACAACGA GCACCGAAAA CAAAACGGCGCTGACCATCA GAATGACGGT AGTGCCGAGT TTCATGGGGC GTTTGCGTAA CGCCGGCATGGCAGGGAGTG TTTCATAGTG GACCTGAGCG ACGAATCGTA AGGTTATTAT CCCTGATGAGGCTCTAATTC AAAGGCATAG GCAGTCGTCC AGTGTGAAAG CCGCTGCTGC AGGCCGCTACTGCATCGTAT ATCGGACGAG ATTTCAATCA ATAACACGCA ATTTCCGCAT CCAACCG 216 330GATCCTGAAA CGCTGACCAG ACGCCGAGCG CGCCGTACCA CGAATCTCCG GTGGCACTCTGCGCACAACC TCTACGCCCA GCGATGGGAA CATCAGCGAA CAGCCGCAGC CGGTAATCGCCGCGCCAATC AGCGAGCCTG CTGACGGAGC GGCCCACATT ACCGCCAGTC CGGTCCCTCTACCAGTAGTG AAAAGGTTGC ACCGTGCGCG CGTAACGGTC GGGAAATTTG GCGCAGAAAAGCGGACAGCG ATAAACGCAT CAACACTATG AAACGGTGAT ACAGTAGTGT GACAGAGTGTATCTAGTGAC ATCTGACAAC TTCTCTCAGC 217 223 GATCTGGGCG AAATCGCGCGGAGTCTGGCG GCGGGCGATA TCATTACCCA CTGTTACAAC GGTAAGCCGA ACCGTATCTTCGGCCTGACG GCGAGCTGCG GCCTCGGTGA CACGAGCGCT GGCCGGCGGC GAGGCTATGGAGTCGGCATG GTACCGCCAG TCCTGAGCTT TGCGTGGCTA ACTCGCTATA GCTGGATTTACCGCATACAT CAGTCGATAT CTC 218 316 GATCGCCACC GTTTTGTGAT GCGCGCCAATTTGGGCTGGA TAGAAACCGG TGATTTCGAC AAAGTTCCGC CGGATTTACG TTTCTTCGCCGGGGGGACCG CAGTATTCGC GGCTATAAAT ACAAATCTAT TTCGCCTAAA GATAGCGACGGCAATCTTAA AGGCGCCTCA AAACTGGCAA CCGGATCGCT GGAGTACCAG TATAACGTCACCGGTAAATG GTGGGGGCAG TGTTTGTCGA TAGCGCGAGC GTGAGTGATA TCGCGTAGCATTCAAACCGG ACGCCGACCG ACCGACCGTG GCTTCAACCT ATTCAC 219 182 GATCTGGGGTGGGGGATTGT TGATGGTGTG TGGAGCGCTG CTGAGCGGAT GGCGGGGGAG GAAGCATCCTGAGTTATTGC CTGATGGCGC TGCGCTTATC AGGCCTACGA GTGAAAAGCA TGGTAGGCCGGATAAGGCGT TCACCGCATC CCGAAAACGA TGTTACTTTT GGCTTTACTG AT 220 419TGCAGATCAA AACAGCGACG GCTGGCAAAA GCGGTAAAGG TTTACGACCG GTCAGCGCCCCAGCCGCCGC CGTGCCAATC ACATTCGCCT CCATAATACC GCAGTTAATG ACATGCTGCGGGTAGTCACG CGCCACGCTG TCATCGCCAT TGAGCTCATT AATCAGCCTC AGGATATGGCTTCAGCCTCA AGCGCAATAA TTGGGCTTCC GGCCTCAATC TGCCCGGCGA TAAAACCGGCGTAAACTTTG CGCATTTCGA TATCGTCTTT AAGCCCTGGG AAGCTTAATC ATGCATGACCTCCAGTTGAT GAATGGCCTC ATTGAACGTT GCTTATCGCA TCGTCAGCGT AAGTGGTGAGAATTCGTTAA CTGCTCAGGC ATGCACCCTG CCTTATGCTG TCAAGGATCA CACCGTGCT 221 126GATCTTATGA CATTGTGAGT ATCCATCGCT TTTTGTACTG 
AGCTGTAGGC AACTCCGACAGCTTTTGCTC AGCAGCTGTT GTTTCTCATA AGCTAGTGAC CAAGCTGCTG CTACCACAGG TCTGGG222 192 GATCCTGCAC GCACGGGCGC ACAGCACCGA CAAGCTGTCC AGCTACTTGACACAGCGCCA GCGCGTGCTA GCGAGCGAAC CCGCAGGTGG CACATGGCGG GGACGGCGAGCAGGAGACAG GCTAGAACGC TTTATGTGCG CACTATGCTA TCAAATAGGC CGTCCGGCTGCACGCCGACA CTACCCTGAC AA 223 331 GATCACCGCA TCGCGAACTG GTTACGGGCCTGTGGAGCGT ATTTTTTGAT GTTATTGGTA TTCATAGAAA ATCCTGCAAA GGGCAGCAGAGCGCTGCCCT GAAATGGGGG TTACTGAAGA CGAATCCGGT CACCTGCCTC AATAGCTGCCAGCAGCGAAG TACGAAGCGT ATCCAGCGCT TTTTCCACCT GTTCGGCGGT TTCCAGCACTTCGCCACCGG TGGCTTTGCG CATCTCGCTG GCGACATTCA CCAGATGCGT TTTTTCGGTACCGGTTGGAT AACGGTTCTC TACCACAACA TAAGCTCGTT GTGACTCGGC GCCTTAGCTT A 224410 GATCTAACGT ATCACGACTA AACGTAAGGG TAAAGCGGCT GGCGTATCGT CCGGGCATAAAGTCATATCG CCTGAACAGA TAACATCTCA CTGACTTTGA AACGCGATTT TATAATTTGCTGCCCAAAAA TACGTGGCGC TGAAAGGCGC ATTTTTGATG CAAATCATTT ATTACTGTGATAACACTGCG CGCGATAAAA CATTAATATA TTCACATAGT AATATGTTCT ATTGGAATGGTTGTTTCGAT ATGACAAAGT CTAAAAAACC ATTGATGTGA AAAGGAATAA GAATTGTCTATATTCCGATT CGGTGGAATT AAGTATTCTC GGATAAAATA GAATGATATT GATATTCTTTTGATATGGTC TATAGCGCTA TGTATCAGAC GCGTGATCGT CGGAGATCAG 225 185GATCTTCGAC TGCCGCGCTT CCGCGACAGC GACATACGGG TGTTCTTTGT CGGTGACGTTTATCCGTTGT CGTGACCTTC ATCCGGTGGT GAAACCTGAG CCGAATAATA CTGTACACCACCACCAGGAC AGAATACTCA AACCACGTTC ATGTGATTGT TGCACCACAT ATTCATTGTT GGAAC226 276 GATCCGCTGA CAGATGTCGT GTACAGCATT CTTTAGAGTG GAACGGTGACCGTACCGCAA AGCTGTGAAA TCAACGCCGG ACAAACGATT CTGGTAAATT TCGGCGCATTATACAGCGGC AATTTCAACC ATGCAGGCCA AAAGCCGGAG GGGGTACGAG CGAAAAAATTCAGTCGCTTC CGGTAAAGTG CAGCGGTCTG GATTCGCAGG TCAATTTAAC AATGCGTCTTATCGCTCCGC GGATAGCACG TCCAGCTATC GCTCGATATG CGATGT 227 383 GATCACCGACCGGACGGTCC GTACCTGGAT TGGGGAGGCG GTTGAGTCCG CAGCGGCTGA CGACGTGACGTTCTCAGACC CGGTGACACC CCATACTTCC GCCACTCCTA TGCGATGCAC ATGCTGTACGCGGCATACCG CTGAAGGTGC TGCAGGCGCT GATGGGACAC AAATCGGTGA GCCTGACGAGTGTACCGAAA GTGTTTGCGC TTGATGTTGC CGCACGACAC CGGGTGCAGT TTCAGATGCCGGGTGCTGAT GCAGTGGCTA TGCTCAAAGG AGGTTCATAG AGACGTGTAT GCATTTTCAGCTTCGCTGCA 
CAGCATCGAA CGGAGTTTAC GCGTTTATCA GCCATGTCTG CGCACAGAGGAGTGTGCTCG AAA 228 357 ACTTGCCGGT AATTTCCATC CCTTCCAGCA CCGCCATCTCTTTACCCTCA ATGGCGATGG ACAGTTTATC CAGCGTTAAC TTTTGGTCGC CCCACGTTCGCCAAAGCTTG CCAGTTTACT GGTACCGTCG GTTTTCAAAT TATTAAAGGT GAGTTGGACCTTCTGATTAT ATTCGTTAAC GGCATCGACC AGGCCGCTCT CGCGCTTCGC CTGACAGCGAAACCACATTA CCGTCTTTAT CGGGCGTTAA CGGGAACTCG GCGCCGCTAA AGGCACCTTTACCGGCATTC TCTGAGTTAA CCGGCTTGAG AGAGATATCG GAGCGGTATC GCCGCCATACATGCGGTATT GATACAA 229 225 GATCTATTTC GGACAGCCAA AAGGCCGTGA AGGCAGCGGTCAGTACAAAA AGCCTTTGAT ACCGAAGTTT ATCACCGGCT TTGAGATCGA GCGCAGTTGCCCGTATGCCT TTGAATCGGC GCGTTAAACC GGCCGTAAAG TACCCTCTAT TGATAAAGCCAACTACTGCA AGCTCTATCT GTGGCGTGAA TACGTCAATA GTGGAAAACG TATCCGATGT GAACT230 275 GATCGTTAAA CAGATTGACC AGTTCGCCAC ACTCTTCCAG ATTAAACCCCACCTGCCTCG CCTGTCGCAG CAACGTCAGC TCGTTTAAAT GCTTCTGCGT GTAGGTGCGATAACCATTTT CGCTACGTAA TGGCGGCGTC ACCAGCCCTT TCTCTTCATA AAACCGAATGGCTTTGTGGT TAGCGTTTTG GCACATCGCT ATATCATATT GCCCTGCCTA CTGCTGAGTTACTATACGGG TACTACGTCT AGAGATCGCG AAAAGGTTAC AGTAC 231 233 GATCGACGTCGCCTGATTTA AGACCCGCAA GCAACATCGT ATTGTTCATG GTCGCGACCT GTAACGAGGTCGATTTTTGC TGTTGATGGA ACCGCCCAAT AGCCGCCGGG AGTATACCCA GCGCAGGTGGGGAGCGGCAA CACGCACCAT CGGCGCTAGC TCCTCTTTGG CGATTCGATC GGATCCTGGCGGTGGTATTC ATGATCTAAT CCTTTTATCG ATGAGTAAAA TTG 232 358 GATCGGCGGAGAATCCCAGA CAGGCCAGGT CTTTCAGCTC GTCGCGGGTC ATCGGGCCGG TAGTATCCTGAGAACCGACG GAGGTCATTT TCGGTTCGCA GTACGCGCCC GGACGGATAC CTTTCACACCACAGGCGCGA CCGACCATTT TCTGTGCCAG CGAGAAGCCA CGGCTGCTTT CCGCCACGTCTTTTGCTGAC GGAAAACGTC TGAGTGCGCA GCCAGCGCTT CACGCTTTTG TGCTAGCACGCGATATCACG ATACACACGC ACGACTCGTC ATCAGCACGT CGTTCAGTCG AGTGCAGTAGCGCGTCATGA TGCGTACTGC TTGACGTAGA CTATCATGCC ATATCAGT 233 302 GATCCACAGGTAGCGTGATG CGTTTTAGTT CCCCCTGCTG CTCAAGTAGC GTCAGGCCGT CGCGTAAATCGTGATATTTC ATGGCGTCCA TTGTAGCCTC TTGGTAAGCG CATCATTATA CGGCGTTCATCATCGGGATG CTGTATTTTT GTTAAATTAG CGTGAACTCT GGCAACCAAC GCTAATCCAGATACGGCTTA AAGGATGAAG TGTATATTAA CTTCGCGCAT GGCTTTTGCT ATGCTTGCGCCCCGAACAGC GATAAGAGTC ATATGCATCT 
GGTATTTACT GTACTGCAAA CG 234 374GATCGTCACC TCCACCCTCG CGCGCGGGGC GGTGAAGCTC TCGAAACAGA AAGTTATCGTGAAGCACCTT GATGCGATTC AGAACTTCGG CGCGATGGAT ATCTGTGCAC TGATAAAACCGGCAACTCTG ACGCAGGATA AAATTGTGCT GGAGAAATCA CACGATATTT CTGGTAAGCCCAGCGAGCAT GTCTGCATTG CGCCTTGCTG ACACATTATC AGACCGTCTA AAAAAATTTCTGATACGCGT CTGAGAGTAG ACAACGCGGT CACCTCGACG TGCAGAAAAT CGATAGATCCGTTATTTAGC GTGCGATGTC GTAGTGTGCG AGATCGACGT GCATCAGCTG GATCTGCAAGCTAACGAGAC TCAC 235 355 GATCGGACTT TATTCGCGCG ATAGTCACGG AAAAAATGGTTTAACTTTGC TAATTCATCC TGAATGTAGG CTCTTCCATC GAAAAACTCC GCCTTGATTGACTCTCCGGT ATGGAGATTG TTTAACGTCA AAAATGCGCG CCGTGGGGTC GAGAGTGTGGCAAACGCTGA GCGCGGGCAG GATGGCGGCG CGAGAGCGAC ACCACCAAGC GCCAGAGCTTGCGCGATTAG CGTCAAATTT GTCATGATAA TCAGGTCTAC AGGTCAATGT TATCGTTAATACACTTCTAC CTTTAAGCAG ACATGATACG CTGACACGAC TCTACGCGTG ATAGTGTGATACTTGGCACA GACTA 236 363 GATCGTCACG TGATTTGCCC GTCACGCGAA TCTCTTCCCCCTGAATTTGC GCCTGCACCT TCAGTTTGCT GTCTTTAATC AGCTTGACGA TTTTCTTCTGCACGGCGCTT TCAATGCCCT GCTTCAGCTT CGCTTCCACA TACCAGGTTT TACCGCTATGCACGAACTCG TCCGGTACAT CCAGCGAAGC GCTTCAATAC CGCGTTTAAG CAGCTTGGCGCGCAGAATAT CGAGCAACTG ATTGACCTGG AAATCGGACT CGCTCAGCAC TTGATGGTTTATTGGCATCG TTCAGTTCAT AGTGCTCTAC GCACGGAGTC AAACAGACTC ACTGGAGCTATCACACGTAC GCGCTCTCGA GAT 237 320 GATCGTTAAT TAGGCGCTGG GCGTGCTGGAGCAGTAATTT ACCGCCTTCC GAGGGGCGTA GTCCTTTACT GTGGCGCTCA AAAAGCGTGATGCCTATCTC ATCTTCGAGT TGAGATAGCC ACTTCGATAG CGCCGCCTGG GAGATATTCATCATCCGGGC GACGTGTCCG TTCAAGGGTT GGCCCTGTTC GGCCCAGCGC AACCAGCGTTTGCGGTGATG TAATTTCAAT TTCTCCCGTT CCATTCGCTA TAACCTCAGG TTATGTCTCTCCTGAAACCA TTGTACTTTA TCCTCCTCTA CACTCGTACT GCACTAACAC 238 406GATCCTGCAA CGCTTTCGAC CCGGTCGAAA TAATGACTTT TTTCCCGGCG CGCAACGCCGAGCGAGGTAA GCATAGGTCT TCCCGGTTCC GGTGCCGGCT TCAACAACCA GAGGCTGCGCATTTTCAATC GCTTGTGTTA CGGCAACCGC CATTTGTCGC TGTGGTTCGC GCGGTTTAAAGCCGGTTATC GCTTTGGCCA GTTGGCATCT GCTGCAAAAT CGTCCGTCAC ACTGCCCCCTGTTAATTTGC ACAGGGATTA TGTCAGGGTA GAAAGGCTTA CACAGTTACA GAGGTGACGGCGGCACATTG TGCAGTCTTG AACCATTCAA ATGAAAAGCA AATGAGGAAT AAGTAATGTCTATCGTGCGT 
ATGATGCGAG ATCGTGTCAG ACGTGTGACT CAATAT 239 263 GATCCTACCGGCCCCCACGC TTTGATTTGA ATAATAGAGG CTACCGACGA CAGCGACATG CTGATAATGTGCTGCGTATC CTGCGCCGGT AAACCCAACG CCTGGCAGAT TAACAGCGCT GGCTGATTACCGCGACAAAC ATGCCACGAG ATGCTGACAA GCGCAAAAGG TTGAGGAGCG CGGCGATCTTCAAGACGGTA AATTAATCGC TGCACAATTG TACGCGACGA TGCATCTCGC ATGCGTCTACGACATAGACA TCT 240 364 GATCAACGCC TAATTTGGCC GCACAATCCA GAGAGACCTGCGGGTGCGGT TTGCTGTAGG GCAATTTTTC TGCAGAAGCC AGCGCGTCAA AACTGTCGCGCAGTTCAAAC ATGGTGAGCA CTTTTTCCAG CATATGCAGC GGCGATGCCG AGGCAAGCCCCACTAATAGC CCCTGCGCTT TACACAGCGC CACAGCTTCG CGCACACCCG GCAAAAGAGGGCGCTCTCTT TCGATAAGCG TAATCGCGCG GGCAATAACA CGGTTTGTCA CTTCTGGCGATCGGGCGTTC ACGTTGCTGC GCAACAGAGA TCGACAACCA TATCATGCGT AGCAAGCTGTTGCAGCTCAT GGCCGAGTAT ATCT 241 221 GATCATTTTA ATGCTGTGTC TTGCCATTTTTTTCTCCATA AATTTCAAAA GGAAATCATG CCTGATGCGC ATTGCGACGG CGTGAGTACCATTCAAGGAT TTGGTGACGA TGCAAACTGA TGGAACGACC AACGACAACA ACAATGAGAAGCGCACCGGA CAATGCGCTG GAATTGATTC GGCACTCCGG CCATCTGTAG CCCTCGTGTAAATCCACCAG C 242 280 GATCATCGAC GTATGTCCTT TCCAGATATT CCGCCCGCCGCCAGCCCACT CAAACAACGG GGGGCGCCGG CAAAAAAGCG AAAGACATCC ACCGATTGCCGGAATTTATA TTAATTACGC CAGTGCAAAG GCTTATTGCA GTTTTGCGAT TCAAGCCGGGCGAACTCAAG GGCGTTTTGC TCGATGCTGT CCGCAGTTTT AACAGACATT CCGCCCGTGCTTTGGGTGTG GTCTGCCCAT TCGGAAACGC GTTATCGGCG GCTGATCGCA GCGTAACCTG 243277 CACTATAACA ACGGCGCGGC GGTACCTGGG CGACGTCGCC AGCGTCACCG ACTCGGTGCAGGATGTCCGT AACGCCGGGA TGACGAACGC TAAACCCGCT ATTTTGTTGA TGATCCGCAAGCTGCCGGAG TGGAATTCCA CATGTGGAAT TCCCATGTCA GCCGTTAAGT GTTCCTGTGTCACTCAAAAT TGCTTTGAGA GGCTCTAAGG GTTCTCAGTG CGTTACATCC CTAAGCTTGTTGTCACAACC GTAACTAAAC TTAAACCTAT ATATCCT 244 380 TGCAGATCAT TGCCTGATGTTCTACGGTCG CAAAATGCAC CAGNNNNCAG AACAACGACA GCGACAACAA TACGGCTGAAGCGCTTTAAT CGCGCTAACT CCTTTTTCTC AAAGCCCCTT TCCGTTCACC TGCTATAGCGTNGAGGGGCC CACTTACCAG GAACAAGACT ATGAACGTTA TTGCTATCAT GAACCACATGGGCGTCTACT TTAAAGAAGA GCCTATTCGT GAACTGCATC GTGCACTGGA AGGTTTAAATTTCGTATCGT CTATCAAAAC GACCGAGAAG ACCTGCTGAA GCTGATTGAA ATAACTCCGCCTTTNNGTCA TTTCGACTGG GATAATATAC 
CTTGAGCTTC GAGAGAGATA GCAGTGAGCG 245353 GATCTGATTA TCGACGCGCT GCTTGGCACC GGCATAGCCC AGGCACCGCG CGACCCGGTAGCCGGTCTGA TTGAACAGGC GAACGCATCC TGCGCCGGTT GTCGCCGTCG ATATCCCGTCAGGTCTGCTG GCGCAAACGG GCGCACGCCT GGCGGGTGAT AAGCGCGCGC ATACGGTCACGTTTATCGCC CTGAAACCAG GCCTGCTGAC CGGCAAAGTG CGTGAGCTTA CCGGCATATTGCATTATGAC GTTGGGACTG GAAGGCTGGC TGGCGAGCAG ACGCGCGTCG GTTTTGAAGAGAGTTGGGGC AATGGCTAAC GCGTGACGAC TGATAGGGAT ATGTGTAGAT ATG 246 376CACCCGGCTG ACTGCCGTAT AATCCAGCTT TTTACGCGGG TCCGCGGAGG GTTTTGCCGTCACAGAGAGC GTATTCTGCG AGTTTATGGT TGTCTTACCT AACGGATAGC CTTCGCTATCATAGCGGTAC TCGACCCTTC ATCTCTTTGC CCGTCGCCGA TACCACAAAA CCGTTGTCGTCCGTTTCCCA GGTCACGCCC GCCGAACGAA CGCCGCCAGC TGGCACTTCC CCTGTAACTGCACCTTTTTT TCCAGCGTCT GAGCATCCCG GTAATAATTG GCATCCAGCA CGAGTGCCAGCCCCGTATTT ATCTCCAGAT CGTGTAACTC AAGCGTATCA AAACAGCCTT CCTGTGAAAGCGTACCGCGA CCTCTA 247 248 GATCAAGACG CGAATCCCCG ACGCGCCGAT AACGCCGTACAACAGCAGCG AGACGCCGCC CATCACGGGT AACGGGATAA TCTGAATCGC CGCCGCCAGTTTGCCAACGC AGGAAAGCAT GTAATAACGA AAATCGCGTC GCCGCCGATA ACCCAGGTACTGTAAACGTC GGTGATCGCC ATGACGCCAA TATTTTCCAT AGTGTATCGG CGTGAGTAGAACCGAATATC GTCGACATCT AGCACATC 248 253 TTTCGACAAA GCGCGCCGCC GAGATATTCGCCATGATCAT GCACTCTTCG ATAAGCTTAT GCGCGTCATT ACGCTGGGTC TGTTCGATACGCTCAATGCG ACGTTCGGCG TTAAAGATAA ACTTCGCCTC TTCGCACACA AACGAGATCCCCCCGCGCTC TTCACGCGCT TTATCCAGCA CTTTGTAGAG GTTGTGCAGC TCTTCAATATGCTTCACAGC GCGCATATGT CACGCAGATC TGATCGCTGC AGC 249 414 GATCAAACACCAGACGACCG CGACGCGCAC GACCATCGGT GGTATCTAAC TCAAATTTCA TTATCACTCCTGCGTCAGAA AAACAGTCCG ACGTTTAACG ACTCGCTACG GAATGATTCC ATAGCTAATAAATTCCCGAA GACGTCATCG GCGCAGAGTT TGGGGTCGAC CAGCGCACAG CCACCGGAGCGTACACGCAG TACGTGAGGA TGGCGAGCAC TGCCGCGTCA AATGCAGTGA GATAGCTCTACGACGTCAGA ATAGCTGCGA TGTACGTGAT AACTGCTCCG TAGCTAAAAG CATTTGTCTACGCAGTCTAT AGGCATCATG TGTGTGATAC GCATGCGAAC AGCATACACG TGATCGCAGATGAGTGTGAT CAGGCATATA CTGACGAACT GATATAGATT CGTG 250 112 GATCTTCCGGGTTCACGGCC ACGCGGTAAT TCTGCCGAGA ATAGTTTTCG GGCGGGTGGT GGCGACAACCAGAAATCTTA CCGTCGCGGT TTTCGCGCCG TCGGCCAGCG GA 251 345 
GATCGTTAAATGTGCGGTAA TCCTGTGATG AATACCGATA CGCAGCCAGA CCAAACCGAG TTAATGTTTGGGTCAGGTAT TTATTATAAG CAATCTGATA ACTCTGACCA TCAAATACGA CGCCATTATCCTGTTTACTG TGCGCTCGCG TAGCTCAAGC GAAATGGCGC CAATCCGGGT ATTCCACCCCGTGCCGAGGG TAAACGCATT ATAATGGTTC GATAGCATCG TACGCATAAG CGTCAACAGGTTATTAGGCA TACTGATACT GATTGGTAAA TCGGCTGATA TCGGCGCTTC AATTATGACTACGCGCGAAA TCATACTGAG CCGTCCAGTC CATTC 252 203 GATCGGTCGC CGCCTTACCTTTTTCCAGTA CACTGAGCAG TTCGCTCAGC AGTTGTTCAA CAGCTCCATC ACTAGAGCGGGAGAGTTCTG GCATAAATCA AAATCTGTTT GTTCATGAAA CGGCAACACA TTAACCGCAGCAACAGTTTT TTTCTGCATT TTTCGGCCTA AATCATCGCC TTACGATACT CTGAATACAG GGG253 273 GATCGTAATC ATTCACTTCG GTCAGCAGCT CGAGCACTAA CGCGTCGAGCACGCCTTCCA TCGGCGCCAG TAAAACACGC ATATCGGTAT CCACAGCAAA AAAAGAGGCGCTATCATAAC GCCTCTCTGC GATGAGCAAA ACTTTTTTGC CGGGTGGCGG CGCAAACGCACGCTACGTAC GTAAGTGCTC ACGCGGCTTC AAGACCAGTT ATTTTTCCAG CCGACCAGCCATTCGAACCG CGATAAGCTC TGCGATCCTT TCCAAGTATG CTG 254 154 GATCTTCTCGCTTTCTTCAG GGCTTACTCC CGTCTCTTCT TCATCGACCG TGATCAAAAT ACCGTCTTTATCCACCAAGA AGCCGACTTC AATCTTCGTA TGAAAATAGC TCACCATTAC GAACTATATTTTTCATCTCT CTTTCCAGCT TTTT

There are many examples where highly-linked virulence genes are involved in the same stage of pathogenesis. Consequently, mapping the coding sequences of the present invention to a particular region of the bacterial chromosome is informative.

MAPPING PROTOCOL

A bacteriophage P22 lysate is made on the fusion strain of interest and used to transduce a recipient strain such as wild type S. typhimurium strain ATCC No. 14028. The resulting tetracycline sensitive, ampicillin resistant fusion strains are grown overnight in LB Amp and then transduced on LB Tet X-gal plates using a bacteriophage P22 lysate made on a pool of random Tn10d-Tc^(r) insertions. White Tet resistant colonies represent either spontaneous Amp sensitives, where the fusion has segregated by homologous recombination between the direct repeats of the cloned fragment, or replacement of the fusion by the region brought in next to the Tn10d-Tc^(r) element.

To verify and measure the linkage of each candidate to the parent fusion, white Tet resistant clones are made phage free and phage sensitive. Bacteriophage P22 lysates are grown on them and used to transduce the parent fusion containing strains again to Tet resistance on LB Tet X-gal plates. Linkage is visually apparent by an increase in the number of white colonies. Strains containing the Tn10d-Tc^(r) insertions next to the fusion locus are used in the next step, mapping by the method of Benson & Goldman; see Benson N. R., et al., J. of Bacteriol., 174:1643-1681 (1992).
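The linkage check above can be put in numbers with a minimal sketch. It assumes that the fraction of white colonies among all Tet resistant transductants is used as a rough proxy for linkage between a Tn10d-Tc^(r) insertion and the fusion; the function name, the candidate labels, and the colony counts are hypothetical, and background from spontaneous segregation of the fusion is ignored.

```python
def white_fraction(white, blue):
    """Fraction of Tet resistant transductants that are white on X-gal.

    A higher fraction suggests tighter linkage between the Tn10d-Tc(r)
    insertion and the fusion (assumption: background segregation ignored).
    """
    total = white + blue
    if total == 0:
        raise ValueError("no colonies were scored")
    return white / total


def rank_candidates(counts):
    """Sort candidate insertions from most to least linked.

    counts maps a candidate label to a (white, blue) colony tally.
    """
    return sorted(counts, key=lambda name: white_fraction(*counts[name]), reverse=True)


# Hypothetical colony tallies for three candidate insertions.
example_counts = {
    "candidate_1": (12, 188),
    "candidate_2": (90, 110),
    "candidate_3": (40, 160),
}
ranking = rank_candidates(example_counts)
```

Here `ranking` lists the candidates by decreasing white fraction, so the first entry is the insertion that would be carried forward to the Benson & Goldman mapping step.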

A selection exists for the loss of the tetracycline resistance determinant of Tn10d-Tc; see Maloy, J. R., et al., J. of Bacteriol., 145:1110-1112 (1981). Plates containing fusaric acid allow the growth of tetracycline sensitive strains over tetracycline resistant strains. In conjunction with this, a set of Mud P22 phage lysates (available from Salmonella Genetic Stock Center, Calgary, Canada), each of which packages a small, defined region of the chromosome, is used to transduce each Tn10d-Tc containing strain to Tet sensitivity. The lysate that produces the most Tet sensitive colonies packages the region where the Tn10 lies in the chromosome and, by inference, the location of the original IVET fusion.
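The interval assignment described above amounts to picking the lysate with the highest count of Tet sensitive colonies. The sketch below assumes one colony count per Mud P22 lysate; the function name and the counts are hypothetical, and the interval labels are borrowed from Table 4 purely for illustration. Ties are returned rather than resolved, since a tie would call for repeating the transduction.

```python
def candidate_intervals(tet_sensitive_counts):
    """Return the interval(s) whose lysate gave the most Tet sensitive colonies.

    tet_sensitive_counts maps an interval label (one per Mud P22 lysate)
    to the number of Tet sensitive colonies recovered on fusaric acid plates.
    More than one label is returned when the top count is tied.
    """
    if not tet_sensitive_counts:
        raise ValueError("no lysates were scored")
    best = max(tet_sensitive_counts.values())
    return sorted(k for k, v in tet_sensitive_counts.items() if v == best)


# Hypothetical counts for one Tn10d-Tc containing strain.
example = {"cysA-purG": 3, "zea-cysA": 41, "purG-proU": 5}
assigned = candidate_intervals(example)
```

With these made-up counts, `assigned` contains the single label `zea-cysA`, which would be taken as the inferred location of the linked IVET fusion.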

After assigning each fusion to an internal donor, lysates grown on all the Tn10d-Tc containing strains in an interval are used to transduce all the strains with IVET fusions in them to Tet resistance on a Tet X-gal plate. This tests the linkage of each fusion to the others, as well as to Tn10 insertions in known genes already mapped on the chromosome, which provide anchor points where possible.

In addition to the map locations of each coding sequence of the present invention, the defined sequence data presented previously has been compared to published sequences, and known genes having homology to the coding sequences of the present invention are cited in Table 4 below.

Table 4 below represents (i) the known map locations of each coding sequence of the present invention; (ii) known genes that share homologous regions with the coding sequences of the present invention; (iii) the type of IVET plasmid that the coding sequences of the present invention were originally cloned into; and (iv) the type of tissue each coding sequence of the present invention was derived from. It is to be understood that while each coding sequence of the present invention was derived from a specific internal organ or macrophage, that does not imply that a gene transcribed or genes cotranscribed with each coding sequence are specific to that particular tissue type. For example, SEQ ID NO. 82 was derived from both intestinal and splenic tissues.

TABLE 4 Seq ID # Vector Gene Between Loci: Tissue 14 pIVET1 cfaaroD-pyrF intestine 80 pIVET1 pgm cobD-putA intestine 13 pIVET1 cadCcysA-purG intestine 247 pIVET2 uraA cysA-purG intestine 8 pIVET1 and 2argE ilvA-melA intestine 76 pIVET2 oxyR ilvA-melA intestine 106 pIVET1tpi ilvA-melA intestine 210 pIVET1 unk ilvA-zjh intestine 213 pIVET1 and8 unk melA-zjh intestine 221 pIVET1 unk melA-zjh intestine 104 pIVET1tolQRA nadA-putA intestine 10 pIVET1 artI nadA-putA intestine 88 pIVET1proS nadC-proA intestine 31 pIVET1 fhuA nadC-proA intestine 28 pIVET1dnaZX proA-purA intestine 55 pIVET1 lon proA-purE intestine 249 pIVET1vacC proA-purE intestine 38 pIVET1 gcvP proU-zgf intestine 79 pIVET1 pgkproU-zgf intestine 101 pIVET2 surE proU-zgf intestine 102 pIVET2 TGI/hybproU-zgf intestine 92 pIVET1 rpiA purG-proU intestine 82 pIVET1 phoPQputA-aroD intestine 91 pIVET1 rbs pyrE-ilvA intestine 195 pIVET1 unkpyrE-ilvA intestine 198 pIVET2 unk pyrE-ilvA intestine 196 pIVET1 unkpyrE-ilvA intestine 111 pIVET2 unk thr-nadC intestine 32 pIVET2flagellar pr tre-zea intestine 75 pIVET1 otsA tre-zea intestine 148pIVET1 unk tre-zea intestine 6 pIVET1 air unmapped intestine 19 pIVET1cysD unmapped intestine 29 pIVET1 fadL unmapped intestine 62 pIVET1 ndkcysA-purG intestine 68 pIVET1 orf211 unmapped intestine 232 pIVET1 unkunmapped intestine 233 pIVET1 unk unmapped intestine 234 pIVET1 unkunmapped intestine 235 pIVET1 unk unmapped intestine 236 pIVET1 unkunmapped intestine 44 pIVET1 hisT zea-cysA intestine 64 pIVET1 nuozea-cysA intestine 157 pIVET1 unk zea-cysA intestine 107 pIVET1 unkzea-cysA intestine 165 pIVET2 unk zea-cysA intestine 252 pIVET1 yejLzea-cysA intestine 39 pIVET1 and 2 gltB zgf-zgi intestine 54 pIVET1 lacAzgf-zgi intestine 85 pIVET1 pnp zgf-zgi intestine 20 pIVET2 cysGzgi-envZ intestine 34 pIVET1 ftsX zgi-envZ intestine 40 pIVET1 glySzgi-envZ intestine 60 pIVET1 mreB zgi-envZ intestine 87 pIVET1 ppizgi-envZ intestine 224 pIVET1 unk zjh-thr intestine 250 pIVET1 valSzjh-thr 
intestine 125 pIVET2 unk cobD-nadA liver 205 pIVET1 unkilvA-melA liver 57 pIVET1 mdh zgi-envZ liver 43 pIVET1 unk aroD-pyrFliver 126 pIVET8 unk cobD-putA liver 70 pIVET1 orf337 cysA-purG liver247 pIVET2 uraA cysA-purG liver 45 pIVET1 hslU ilvA-melA liver 106pIVET1 tpi ilvA-melA liver 202 pIVET1 unk ilvA-melA liver 12 pIVET1 brnQproA-purE liver 90 pIVET1 purA-like proA-purE liver 73 pIVET2 orfAzea-cysA liver 23 pIVET1 dam/trpS zgi-envZ liver 250 pIVET1 valS zjh-thrliver 138 pIVET8 unk aroD-pyrF macrophage 139 pIVET8 unk aroD-pyrFmacrophage 246 pIVET8 unk aroD-pyrF macrophage 37 pIVET8 galK cobD-nadAmacrophage 124 pIVET8 unk cobD-nadA macrophage 167 pIVET8 unk cysA-purGmacrophage 169 pIVET8 unk cysA-purG macrophage 168 pIVET8 unk cysA-purGmacrophage 72 pIVET8 orf543 ilvA-melA macrophage 84 pIVET8 pmrBilvA-melA macrophage 199 pIVET8 unk ilvA-melA macrophage 200 pIVET8 unkilvA-melA macrophage 207 pIVET8 unk ilvA-melA macrophage 17 pIVET8 cutA2melA-zjh macrophage 58 pIVET8 mgtA melA-zjh macrophage 211 pIVET8 unkmelA-zjh macrophage 212 pIVET8 unk melA-zjh macrophage 50 pIVET8 IS200nadA-putA macrophage 83 pIVET8 phrA nadA-putA macrophage 127 pIVET8 unknadA-putA macrophage 128 pIVET8 unk nadA-putA macrophage 129 pIVET8 unknadA-putA macrophage 98 pIVET8 speE nadC-proA macrophage 94 pIVET8S.t.res/mod proA-purE macrophage 114 pIVET8 unk proA-purE macrophage 115pIVET8 unk proA-purE macrophage 118 pIVET8 unk proA-purE macrophage 116pIVET8 unk proA-purE macrophage 117 pIVET1 unk proA-purE macrophage 178pIVET8 recD proU-zgf macrophage 177 pIVET8 unk proU-zgf macrophage 179pIVET8 unk proU-zgf macrophage 180 pIVET8 unk proU-zgf macrophage 121pIVET8 unk purE-cobD macrophage 33 pIVET8 folD purE-cobD macrophage 174pIVET8 unk purG-proU macrophage 131 pIVET8 unk putA-aroD macrophage 132pIVET8 unk putA-aroD macrophage 105 pIVET8 torA pyrE-ilvA macrophage 194pIVET8 unk pyrE-ilvA macrophage 53 pIVET8 kdsA pyrF-tre macrophage 144pIVET8 unk pyrF-tre macrophage 110 pIVET8 unk thr-nadC 
macrophage 109pIVET8 unk thr-nadC macrophage 71 pIVET8 orf48 tre-zea macrophage 146pIVET8 unk tre-zea macrophage 228 pIVET8 unk unmapped macrophage 229pIVET8 unk unmapped macrophage 16 pIVET8 col 1 rec. zea-cysA macrophage18 pIVET8 cysA zea-cysA macrophage 66 pIVET8 orf179 zea-cysA macrophage93 pIVET8 rplY zea-cysA macrophage 151 pIVET8 unk zea-cysA macrophage152 pIVET8 unk zea-cysA macrophage 153 pIVET8 unk zea-cysA macrophage155 pIVET8 unk zea-cysA macrophage 154 pIVET8 unk zea-cysA macrophage184 pIVET8 unk zgf-zgi macrophage 185 pIVET8 unk zgf-zgi macrophage 49pIVET8 IS2/IS30 zgi-envZ macrophage 86 pIVET8 ponA zgi-envZ macrophage188 pIVET8 unk zgi-envZ macrophage 222 pIVET8 unk zjh-thr macrophage 223pIVET8 unk zjh-thr macrophage 14 pIVET1 cfa aroD-pyrF spleen 30 pIVET8fdnGHI aroD-pyrF spleen 63 pIVET8 nifj aroD-pyrF spleen 140 pIVET8 unkaroD-pyrF spleen 141 pIVET8 unk aroD-pyrF spleen 142 pIVET8 unkaroD-pyrF spleen 143 pIVET8 unk aroD-pyrF spleen 43 pIVET1 unk aroD-pyrFspleen 251 pIVET1 yehB aroD-pyrF spleen 52 pIVET8 kdpD cobD-nadA spleen67 pIVET1 orf2 cobD-nadA spleen 80 pIVET1 pgm cobD-putA spleen 126pIVET8 unk cobD-putA spleen 13 pIVET1 cadC cysA-purG spleen 70 pIVET1orf337 cysA-purG spleen 69 pIVET1 orf384 cysA-purG spleen 170 pIVET1 unkcysA-purG spleen 171 pIVET8 unk cysA-purG spleen 172 pIVET8 unkcysA-purG spleen 173 pIVET2 unk cysA-purG spleen 168 pIVET8 unkcysA-purG spleen 247 pIVET2 uraA cysA-purG spleen 5 pIVET8 aceKilvA-melA spleen 7 pIVET1 arg.perm. 
ilvA-melA spleen 45 pIVET1 hslUilvA-melA spleen 48 pIVET8 ilv ilvA-melA spleen 78 pIVET1 pfkA ilvA-melAspleen 106 pIVET1 tpi ilvA-melA spleen 199 pIVET8 unk ilvA-melA spleen200 pIVET1 unk ilvA-melA spleen 201 pIVET1 unk ilvA-melA spleen 203pIVET1 unk ilvA-melA spleen 204 pIVET1 unk ilvA-melA spleen 206 pIVET8unk ilvA-melA spleen 208 pIVET8 unk ilvA-melA spleen 209 pIVET2 unkilvA-melA spleen 202 pIVET1 unk ilvA-melA spleen 207 pIVET8 unkilvA-melA spleen 35 pIVET8 fumB melA-zjh spleen 58 pIVET8 mgtA melA-zjhspleen 214 pIVET8 unk melA-zjh spleen 215 pIVET8 unk melA-zjh spleen 216pIVET8 unk melA-zjh spleen 217 pIVET8 unk melA-zjh spleen 218 pIVET8 unkmelA-zjh spleen 219 pIVET2 unk melA-zjh spleen 220 pIVET1 unk melA-zjhspleen 213 pIVET8 unk melA-zjh spleen 221 pIVET1 unk melA-zjh spleen 248pIVET1 vacB melA-zjh spleen 11 pIVET1 asnS nadA-putA spleen 27 pIVET1deoR nadA-putA spleen 46 pIVET8 hutH nadA-putA spleen 130 pIVET8 unknadA-putA spleen 88 pIVET1 proS nadC-proA spleen 97 pIVET8 speDnadC-proA spleen 98 pIVET8 speE nadC-proA spleen 77 pIVET8 tia-likenadC-proA spleen 112 pIVET1 unk nadC-proA spleen 113 pIVET1 unknadC-proA spleen 12 pIVET1 brnQ proA-purE spleen 55 pIVET1 lon proA-purEspleen 90 pIVET1 purA-like proA-purE spleen 116 pIVET8 unk proA-purEspleen 117 pIVET8 unk proA-purE spleen 119 pIVET1 unk proA-purE spleen120 pIVET8 unk proA-purE spleen 38 pIVET1 gcvP proU-zgf spleen 56 pIVET1lysS proU-zgf spleen 102 pIVET2 TGI/hyb proU-zgf spleen 181 pIVET1 unkproU-zgf spleen 182 pIVET1 unk proU-zgf spleen 183 pIVET8 unk proU-zgfspleen 122 pIVET8 unk purE-cobD spleen 123 pIVET2 unk purE-cobD spleen 4pIVET1 48k prot purG-proU spleen 92 pIVET1 rpiA purG-proU spleen 100pIVET1 srmB purG-proU spleen 22 pIVET1 unk purG-proU spleen 175 pIVET1unk purG-proU spleen 176 pIVET8 unk purG-proU spleen 36 pIVET1 g30kputA-aroD spleen 61 pIVET1 ndh putA-aroD spleen 137 pIVET1 unk putA-aroDspleen 103 pIVET8 unk (cbiJ/thr) putA-pyrF spleen 59 pIVET8 mgtBpyrE-ilvA spleen 91 pIVET1 rbs 
pyrE-ilvA spleen 105 pIVET8 torApyrE-ilvA spleen 108 pIVET8 uhpB pyrE-ilvA spleen 197 pIVET8 unkpyrE-ilvA spleen 196 pIVET1 unk pyrE-ilvA spleen 41 pIVET1 gtpl pyrF-trespleen 42 pIVET1 hemA pyrF-tre spleen 145 pIVET1 unk pyrF-tre spleen 109pIVET8 unk thr-nadC spleen 32 pIVET2 flagellar pr tre-zea spleen 147pIVET1 unk tre-zea spleen 149 pIVET8 unk tre-zea spleen 150 pIVET8 unktre-zea spleen 62 pIVET1 ndk unmapped spleen 65 pIVET8 orfl.3 unmappedspleen 68 pIVET1 orf211 unmapped spleen 81 pIVET1 phnK unmapped spleen89 pIVET8 pspA unmapped spleen 230 pIVET1 unk unmapped spleen 231 pIVET1unk unmapped spleen 237 pIVET8 unk unmapped spleen 238 pIVET8 unkunmapped spleen 239 pIVET8 unk unmapped spleen 240 pIVET8 unk unmappedspleen 241 pIVET8 unk unmapped spleen 242 pIVET8 unk unmapped spleen 243pIVET8 unk unmapped spleen 244 pIVET8 unk unmapped spleen 245 pIVET8 unkunmapped spleen 99 pIVET8 spvB virulence spleen plasmid 227 pIVET8 unkvirulence spleen plasmid 18 pIVET8 cysA zea-cysA spleen 21 pIVET1 cysKzea-cysA spleen 24 pIVET1 dedB zea-cysA spleen 25 pIVET1 dedE zea-cysAspleen 44 pIVET1 hisT zea-cysA spleen 66 pIVET8 orf179 zea-cysA spleen73 pIVET2 orfA zea-cysA spleen 74 pIVET1 orf_f167 zea-cysA spleen 154pIVET8 unk zea-cysA spleen 156 pIVET1 unk zea-cysA spleen 158 pIVET8 unkzea-cysA spleen 159 pIVET8 unk zea-cysA spleen 160 pIVET8 unk zea-cysAspleen 161 pIVET8 unk zea-cysA spleen 162 pIVET8 unk zea-cysA spleen 163pIVET8 unk zea-cysA spleen 164 pIVET8 unk zea-cysA spleen 166 pIVET1 unkzea-cysA spleen 107 pIVET1 unk zea-cysA spleen 165 pIVET2 unk zea-cysAspleen 252 pIVET1 yejL zea-cysA spleen 253 pIVET8 yohI zea-cysA spleen187 pIVET1 unk zgf-envZ spleen 39 pIVET8 gltB zgf-zgi spleen 47 pIVET8iap zgf-zgi spleen 54 pIVET1 lacA zgf-zgi spleen 185 pIVET8 unk zgf-zgispleen 186 pIVET8 unk zgf-zgi spleen 9 pIVET1 aroK zgi-envZ spleen 20pIVET2 cysG zgi-envZ spleen 23 pIVET1 dam/trpS zgi-envZ spleen 34 pIVET1ftsX zgi-envZ spleen 40 pIVET1 glyS zgi-envZ spleen 51 pIVET1 kblzgi-envZ 
spleen 60 pIVET1 mreB zgi-envZ spleen 189 pIVET1 unk zgi-envZspleen 190 pIVET8 unk zgi-envZ spleen 191 pIVET8 unk zgi-envZ spleen 192pIVET8 unk zgi-envZ spleen 193 pIVET8 unk zgi-envZ spleen 95 pIVET1 secBzgi-pyrE spleen 15 pIVET1 chvD hom. zjh-thr spleen 26 pIVET1 deoABzjh-thr spleen 96 pIVET1 serB/smp zjh-thr spleen 225 pIVET8 unk zjh-thrspleen 226 pIVET8 unk zjh-thr spleen 224 pIVET1 unk zjh-thr spleen

The examples which follow are not intended to limit the scope of the present invention but rather exemplify how the coding sequences disclosed are useful in identifying and isolating microbial virulence genes, the products of which will provide potential targets for the development of antimicrobial agents or vaccines.

EXAMPLE 1 Identification of Known Genes that Are or Have Been Implicated in Salmonella Virulence

As discussed previously, the defined portions of the coding sequences of the present invention have been compared to published sequences, and genes that were either previously known or believed to be implicated in Salmonella virulence have been identified. Several known Salmonella spp. virulence genes have been identified using the coding sequences of the present invention, shown in Table 5, thus validating the method and probes of the present invention.

TABLE 5 Genes of Salmonella Virulence

SEQ ID NO.  GENE      FUNCTION                     ROLE IN PATHOGENESIS
82          phoPQ     virulence regulator          invasion, macrophage survival
99          spvB      plasmid virulence            systemic survival
178         recBCD    recombination/repair         macrophage survival
199         pmrAB     polymyxin resistance         neutrophil survival
13          cadC      lysine decarboxylase         acid tolerance
76          oxyR      oxidative stress regulator   macrophage survival
31          fhuA      Fe⁺⁺ transport               Fe⁺⁺ accumulation
58/59       mgtA/BC   Mg⁺⁺ transport               Mg⁺⁺ sensor

Examples of genes known to be involved in virulence include phoPQ, the two-component global regulator of Salmonella spp. virulence involved in invasion, macrophage survival, and defensin resistance, as well as spvB, a Salmonella plasmid virulence gene whose function is to facilitate growth at systemic sites of infection. phoPQ gene products are involved in both early and late stages of infection, since phoPQ mutants show a defect after either oral or intraperitoneal delivery. Accordingly, phoPQ in vivo induced fusions were isolated from the spleen after either oral or intraperitoneal infection. In contrast, mutants that lack the Salmonella spp. virulence plasmid are defective in late stages of infection; consistent with this infection profile, spvB fusions were isolated from the spleen after intraperitoneal delivery.

Another class of in vivo induced fusions resides in recBCD, encoding exonuclease V, the primary recombination and repair enzyme in bacteria. recBCD has been shown to be required for full virulence and has been implicated in superoxide resistance in cultured macrophages. Correspondingly, the recBCD fusion was isolated from cultured macrophages, presumably reflecting the pathogen's protective recombination and repair response to DNA damage resulting from the macrophage oxidative burst.

The next three classes of in vivo induced genes shown in Table 5 (pmrAB, cadC and oxyR) are in regulatory loci that may be implicated in Salmonella virulence due to the biochemical functions that are associated with their expression. Examples include pmrAB, a two-component regulator that controls resistance to cationic antibacterial proteins (CAP) of human neutrophils and to the drug polymyxin B. The apparent in vivo induction of pmrAB may be involved in resistance to similar, as yet undefined, murine macrophage-derived antibacterial proteins.

cadC is an in vivo induced regulatory locus that controls lysine decarboxylation. These fusions were isolated from the intestine after an oral infection and from the spleen after an intraperitoneal infection. Decarboxylation of basic amino acids produces primary amines which may increase the pH of host cell organelles such as the phagosome. The fact that cadC was isolated from different host tissues suggests that it may function to increase the pH of several different host cell organelles (e.g., in response to the low pH of the stomach or phagosome). Moreover, CadC is topologically similar to ToxR, the global regulator of virulence in Vibrio cholerae. Both cadC and toxR respond to low pH and media composition, but it is not known whether toxR regulates polyamine synthesis in Vibrio cholerae or whether cadC regulates other virulence genes in Salmonella spp.

Last, oxyR, a regulator of the oxidative stress response, was recovered from the mouse intestine, a tissue which is thought to be relatively anaerobic. The apparent in vivo induction of oxyR may be in response to the oxidative burst of macrophages present in mucosal associated lymphoid tissue (MALT) that lines the intestinal epithelium. Alternatively, this may be a developmental response: oxyR may be inducing bacterial oxidative protective systems within the lumen of the intestine in anticipation of encountering macrophages in some later stage in the infection cycle, such as in the blood or spleen.

EXAMPLE 2 Virulence Genes of Other Pathogens Not Previously Known to Exist in Salmonella spp.

The coding sequences of the present invention have been compared to published sequences, and virulence genes of other pathogens not previously known to exist in Salmonella spp. have been identified; see Table 6.

TABLE 6 Virulence Genes of Other Pathogens

SEQ. ID NO.  GENE     FUNCTION                                       ROLE IN PATHOGENESIS            
248/249      vacB/C   ipa/icsA expression                            invasion/intercellular spread   Shigella spp.; EIEC
254          cpxA     virF expression                                invasion/intercellular spread   Shigella spp.
251          yehB     pilin assembly                                 adherence                       K. pneumoniae; H. influenzae; EIEC
77           tia      gut epithelial invasion                        adherence; invasion             EIEC
15           chvD     virG expression (plant signal transduction)    virulence                       A. tumefaciens

In vivo induced fusions were recovered in homologs of virulence genes of other pathogens not previously known to exist in Salmonella spp., including vacB and vacC of Shigella spp. and enteroinvasive E. coli (EIEC). vacB mutants are defective in the synthesis of invasion plasmid antigens (ipa) and intercellular spread (ics) gene products, which are required for invasion and lateral spread within host cells. The affected genes are transcribed at normal levels but the corresponding proteins are not detected. vacB fusions were isolated from the spleen after an oral or intraperitoneal infection, suggesting that vacB is needed at both early and late stages of infection, possibly for invasion of the intestinal epithelium and for invasion at systemic sites of infection (e.g., invasion of splenic macrophages in a manner that may not activate phagocyte killing mechanisms). vacC is homologous to E. coli tgt, which encodes a transglycosylase that modifies tRNA molecules. In contrast to vacB, Shigella spp. vacC mutants show reduced transcription of the ipa genes; they do not form plaques on cultured mammalian cells and exhibit reduced survival in stationary phase. Some tRNA modifications (encoded by miaA and tgt) are sensitive to environmental signals such as Fe⁺⁺, O₂, and growth state. The in vivo induction of environmentally-sensitive tRNA modifications may contribute to the changes in bacterial gene expression (by attenuation) and/or protein synthesis (by altered codon preference) that may occur in host tissues (note that chorismate, produced by a metabolic in vivo induced gene, aroK, is also involved in tRNA modification).

A third class of fusions maps to the E. coli yehB locus, which has sequence similarity to proteins involved in pilin assembly in many pathogens, including mrkC of Klebsiella pneumoniae, hifC of Haemophilus influenzae, and CS3 pilin assembly components of enterotoxigenic Escherichia coli. yehB fusions were isolated from the spleen after an intraperitoneal infection and may represent a new class of Salmonella spp. surface properties that are induced at systemic sites of infection.

Recently, it has been shown that Pseudomonas aeruginosa encodes virulence factors that are required for infection of both plants and animals. Similarly, one class of in vivo induced fusions isolated from the spleen after an oral infection resides in a gene that has amino acid sequence identity to chvD, a chromosomal virulence gene involved in signal transduction in the plant pathogen A. tumefaciens. Under conditions of low pH and phosphate starvation, chvD is required for the induction of transcription of virG, the regulatory component of the virA/G two-component regulatory system in A. tumefaciens. The apparent in vivo induction of a chvD homolog in S. typhimurium may represent another example of a sensory virulence determinant shared by animal and plant pathogens.

EXAMPLE 3 Unknown Genes

Unknown coding regions of promoters that are induced in vivo have also been identified and are represented by SEQ ID NOS. 22, 43, 103, 107, 109-177 and 179-253.

One can imagine that pathogens possess many functions that are required during infection, but are not easily detected on laboratory media or identified by biochemical assay. The coding sequences of the present invention allow for the identification of previously unknown genes and provide a means to associate them with a phenotype: induction in the host. Indeed, the functions of >40% of the in vivo induced genes are unknown. The members of this class have either no homology with the DNA data base or encode open reading frames with no assigned function. Defined regions of the coding sequences of the present invention sharing homology to unknown genes have been isolated from all IVET vectors (pIVET1, 2, and 8) made according to the present invention, and from all routes of delivery (oral, intraperitoneal) and host tissues (intestine, spleen, liver) tested. These unknown fusions have been mapped (shown in Table 4) to determine whether they cluster to a specific region of the S. typhimurium chromosome, possibly functioning in the same stage of pathogenesis. Thus, by combining the knowledge of the in vivo induction phenotype, the host tissue from which the coding sequences of the present invention were recovered, and the chromosomal map positions, one has the means to begin investigating not only novel virulence factors but also bacterial sensory and biochemical pathways that remain undefined. Coding sequences of the present invention having homology to unknown genes are found throughout the chromosome. However, clusters of in vivo induced fusions in adjacent genes do occur in some locations. For example, two unknown in vivo induced fusions reside in the previously reported open reading frames orf384 and orf337; in vivo induced A (SEQ ID NO. 69) and B (SEQ ID NO. 70) lie in transcription units that are highly linked to the metabolic in vivo induced gene, ndk, discussed further below.
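The clustering question raised above (whether mapped fusions fall close together on the chromosome) can be illustrated with a short sketch. The fusion names, map positions, and window size below are hypothetical placeholders, not values from Table 4, and the grouping rule (a fusion joins a cluster when it lies within a fixed distance of the previous fusion) is only one simple way such an analysis might be approached. The sketch treats the map as linear and ignores wrap-around on a circular chromosome.

```python
from typing import List, Tuple

Fusion = Tuple[str, float]  # (fusion name, map position in arbitrary map units)


def cluster_by_map_position(fusions: List[Fusion], window: float = 1.0) -> List[List[Fusion]]:
    """Group fusions whose map positions lie within `window` of the previous fusion.

    Assumption: a linear map; wrap-around on a circular chromosome is ignored.
    """
    ordered = sorted(fusions, key=lambda f: f[1])
    clusters: List[List[Fusion]] = []
    for name, pos in ordered:
        # Compare with the last member of the current cluster (single linkage).
        if clusters and pos - clusters[-1][-1][1] <= window:
            clusters[-1].append((name, pos))
        else:
            clusters.append([(name, pos)])
    return clusters


# Hypothetical positions, for illustration only.
example = [("fusion_a", 12.3), ("fusion_b", 12.8), ("fusion_c", 55.0), ("fusion_d", 13.5)]
for group in cluster_by_map_position(example, window=1.0):
    print([name for name, _ in group])
```

Clusters containing more than one fusion would then be candidates for the common-promoter analysis described in the examples that follow.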

EXAMPLE 4 Method of Using the Coding Sequences of the Present Invention to Identify Genes Involved in Virulence

Each in vivo induced clone can be used to isolate mutations in the gene identified by sequence analysis. Insertion mutations generated by transposable elements (Mahan, et al., J. of Bacteriol, 175(21):7086-7091 (1993)) that disrupt an operon will reduce the transcription of the lac gene. These insertions will have a light blue color on LB plates supplemented with X-gal. Some of these will be insertions in the in vivo induced gene, identified by sequence analysis. In addition, genes that are downstream of the operon promoter, but proximal to the ivi lac fusion, may be disrupted; this will result in reduced transcription of the lac genes, again resulting in a light blue phenotype on X-gal containing plates. Sequence analysis of the DNA surrounding the insertion will identify new genes cotranscribed with the original in vivo induced gene.
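The screening logic in this paragraph can be summarized as a small model: an insertion between the operon promoter and the lac fusion reduces lac transcription, while an insertion distal to the fusion does not. The sketch below is an illustrative simplification, not part of the described method; the element names and the single-promoter operon layout are assumptions made for the example.

```python
from typing import List


def lac_expression_after_insertion(operon: List[str], disrupted: str, reporter: str = "lac") -> str:
    """Predict the effect of a transposon insertion on the lac reporter.

    `operon` lists elements in transcription order, starting just after the
    promoter. An insertion in any element upstream of the reporter is treated
    as polar, so reporter transcription is reduced.
    """
    if disrupted == reporter:
        raise ValueError("insertions in the reporter itself are not modeled")
    reporter_index = operon.index(reporter)
    hit_index = operon.index(disrupted)
    return "reduced" if hit_index < reporter_index else "unchanged"


# Hypothetical operon layout: promoter -> upstream_gene -> ivi_gene -> lac -> downstream_gene
operon = ["upstream_gene", "ivi_gene", "lac", "downstream_gene"]
for element in ["upstream_gene", "ivi_gene", "downstream_gene"]:
    print(element, lac_expression_after_insertion(operon, element))
```

In this model, "reduced" corresponds to the light blue colonies on X-gal plates, and any upstream element that gives this result is a candidate cotranscribed gene.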

As an example, tia (SEQ ID NO. 77) is an in vivo induced gene identified by the method of the present invention which encodes a product with protein sequence similarity (as translated from the DNA sequence) to an E. coli protein that directs invasion of gut epithelial cells in tissue culture. The coding sequence of the present invention containing the tia fusion was used to isolate insertions that disrupt the tia coding sequence by looking for transposon insertions that reduce the transcription of the lac gene. Among the mutations isolated by this method are transposon insertions in tia and also in a gene promoter-proximal to tia. This gene, having the partially defined sequence

3′-CGCTGTCCTG GTGTTAAGAC TTTGCTTAAA TCAAAATAAT ATTTAACCCG
   ATAATAGCGA GCCTGTTGTT CTATGTTACT GAAGGCTGCA AGCTGCTGTT
   TTACGGCGGC GTCATCCCAT TTACCGGATT TAATCACCTC TATCAGCGCA
   CCGTCTTTAA TTCCCTTCAT AGAAATCTGA CTGACGTCGG TTTCCAGTTG
   TTGGTGAAGT TTTTTGATCC GGGTAATCTG ATCGTTTGTC AGCTTCAGAT
   GCTGGACAAT AGGATCCTGG GCGGGCAGGG GGAGGATTGG GGACAGCGTG
   CAAGCAAAAG AAACGCGCAG AGTCGCTGCA GTAAGTGGGC ATACGTTT-5′ (SEQ ID NO. 255)

encodes a protein product with sequence similarities to pfEMP, a protein encoded by Plasmodium falciparum (the causative agent of malaria) during infection of red blood cells. Thus, the identified sequences of the present invention described here can and do lead to the identification of other genes specifically induced by the bacterium during infection. Each in vivo induced clone contains one or more genes transcribed from a single promoter; thus insertion mutations that are proximal to the operon promoter are capable of disrupting and reducing the transcription of distally positioned genes, including the lac gene.

In the alternative to using this insertional mutagenesis technique to identify other non-sequential genes that are cotranscribed with the genes for which partial sequences have been defined (SEQ ID NOS. 4-254), these defined sequences may also be used as probes to identify cotranscribed genes. Defined sequences identified by SEQ ID NOS. 4-254, or portions thereof, can be used to prime the synthesis of a cDNA library from total bacterial mRNA. There are many routes to a cDNA library; however, regardless of the pathway, the first step is the synthesis of a DNA strand complementary to the mRNA sequence. The reaction requires template RNA, a complementary primer, reverse transcriptase, and deoxyribonucleoside triphosphates; see Maniatis, Id. or S. Berger, et al., Guide to Molecular Cloning Techniques, 152:307-389 (1987). This cDNA will contain the transcribed sequence from the mRNA start site to the priming site. This cDNA can be used to detect clones that overlap this region of DNA by Southern hybridization. From those clones, DNA fragments can be used as probes in Northern hybridization against total mRNA. Each DNA fragment that hybridizes to the mRNA defined by the original cDNA can be inferred to contain sequences cotranscribed with the original in vivo induced gene sequence defined here. Thus, each coding sequence of the present invention can be used to isolate and identify additional genes that are expressed during infection, each of which may encode products useful for the development of antibiotics and/or vaccines. In the alternative, the defined sequences (SEQ ID NOS. 4-254) may be used to probe DNA libraries to identify and study homologous regions of interest.
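Because first-strand cDNA synthesis needs a primer complementary to the mRNA, a primer derived from a defined region is the reverse complement of the corresponding stretch of the sense (mRNA-like) strand. The sketch below illustrates that step only. It assumes the input is written 5′ to 3′ in the sense orientation, which would have to be confirmed for any given sequence listing entry; the function names and the default primer length are illustrative choices, not values taken from this description.

```python
# Complement table for DNA bases; N (unknown base) maps to N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (uppercase, 5'->3')."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def cdna_primer(sense_region: str, length: int = 18) -> str:
    """Return a primer complementary to the 3' end of a sense-strand region.

    Whitespace in the input (e.g. 10-base blocks from a sequence listing)
    is removed before the primer is derived.
    """
    cleaned = "".join(sense_region.split()).upper()
    if length <= 0 or length > len(cleaned):
        raise ValueError("primer length must be between 1 and the region length")
    return reverse_complement(cleaned[-length:])


# Toy input, for illustration only.
print(cdna_primer("AAAACCCC GGGGTTTT", length=4))
```

A primer placed at the 3′ end of the defined region yields cDNA extending from the priming site back toward the mRNA start site, which matches the span described above.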

EXAMPLE 5 Method of Using the Coding Sequences of the Present Invention to Identify Genes Within the Same Operon

As discussed above in Example 4, in vivo induced genes may be identified by the defined regions of the coding sequences of the present invention that are relatively short (70-400 bp). Some bacterial operons are large, greater than 10 kilobases in length. It is reasonable to expect, therefore, that multiple fusions in the same operon might be recovered by the IVET selection. Three in vivo induced fusions (ndk, SEQ ID NO. 62; orf384, SEQ ID NO. 69; and orf337, SEQ ID NO. 70) are in genes known to be near each other on the E. coli chromosome and transcribed in the same direction. Insertion mutations that reduce the expression of the lac gene in the orf337 synthetic operon were isolated. One transposon insertion, which disrupts the coding sequence of ndk, reduces the expression of the downstream orf384 lac fusion, indicating that all three genes, ndk, orf384 and orf337, are transcribed as a unit and may have related functions as they relate to virulence. In this way, fusions to unknown genes that lie close to one another, as determined by mapping, can be analyzed for a common promoter. The existence of such a promoter and the study of its regulation may provide clues to the role of each in vivo induced gene transcribed or cotranscribed with the coding sequences of the present invention during microbial infection of a host.

EXAMPLE 6 Method of Using the Coding Sequences of the Present Invention to Identify Environmental or Host Signals that Coordinate and Regulate Virulence Genes

Because the expression of each in vivo induced (ivi) fusion can be easily assayed by measuring the activity of the lac reporter gene, the signals that regulate ivi genes in vivo can be determined. If there are molecules present in host tissues that induce the expression of ivi genes, the activity of those molecules can be assayed by their effect on the transcription of the lac gene in the ivi construct. Extracts of host tissues can be used to look for host molecules that induce the expression of ivi lac fusions. Purification of this activity can be further monitored by repeated assays. In this way, host compounds, e.g. cytokines or other molecules which may be used as antibacterial drugs, can be identified. Genes have been identified that respond to concentrations of Mg⁺⁺ and/or pH, e.g. SEQ ID NOS. 77 and 84.
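One common way to quantify lac reporter activity is the standard Miller β-galactosidase assay; the description here does not state which assay was used, so the sketch below is an assumption offered for illustration. It computes Miller units from absorbance readings and then the fold induction of a fusion in the presence versus absence of a candidate signal (for example, a host tissue extract). All function and parameter names are illustrative.

```python
def miller_units(a420: float, a550: float, a600: float, minutes: float, volume_ml: float) -> float:
    """Standard Miller formula: 1000 * (A420 - 1.75 * A550) / (t * v * A600).

    a420, a550: absorbances of the reaction mix; a600: culture density;
    minutes: reaction time; volume_ml: culture volume assayed.
    """
    if minutes <= 0 or volume_ml <= 0 or a600 <= 0:
        raise ValueError("time, volume and A600 must be positive")
    return 1000.0 * (a420 - 1.75 * a550) / (minutes * volume_ml * a600)


def fold_induction(with_signal: float, without_signal: float) -> float:
    """Ratio of reporter activity with a candidate signal to activity without it."""
    if without_signal <= 0:
        raise ValueError("baseline activity must be positive")
    return with_signal / without_signal


# Made-up absorbance readings, for illustration only.
baseline = miller_units(a420=0.05, a550=0.0, a600=0.5, minutes=10, volume_ml=0.1)
induced = miller_units(a420=0.50, a550=0.0, a600=0.5, minutes=10, volume_ml=0.1)
print(fold_induction(induced, baseline))
```

Tracking fold induction across purification fractions would be one way to follow the inducing activity through the repeated assays mentioned above.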

EXAMPLE 7 Method of Using the Coding Sequences of the Present Invention to Distinguish Salmonella from other Microbes

Dissimilarities in genome composition within a species highlight the functions that distinguish one serovar from another and may define the aspects of their life-style that selectively maintain individual serovars. Using in vivo expression technology (IVET), 5 Salmonella-specific in vivo induced genes have been identified in regions of aberrant G+C content that distinguish host adapted from non-host adapted serovars. Many of the sequences within these regions encode adhesin and invasin-like proteins. These in vivo selected sequences contribute to the molecular events that dictate evolution of species, host range, tissue tropism, and pathogenicity of enteric bacteria.

Insights into the molecular basis of speciation are derived from the identification of selectively maintained functions that confer upon natural populations the ability to occupy distinct niches. Within the context of pathogenesis, such sequence disparities contribute to the unique capabilities that allow pathogens to colonize host sites inaccessible to commensal organisms. In many cases, these sequence-specific genes reside on extra-chromosomal elements (e.g., plasmids or phages) or specialized regions of the chromosome termed pathogenicity islands. These virulence modules are presumed to have been acquired by horizontal transfer as evidenced by their atypical G+C content and codon usage.

The in vivo induced (ivi) genes of the present invention are poorly expressed on laboratory medium but exhibit relatively elevated levels of expression in host tissues. As will be discussed in further detail below, many of these ivi genes exhibit an atypical sequence composition and define Salmonella-specific regions of the chromosome that distinguish broad host range from host adapted serovars.

Atypical sequence composition of ivi genes. To identify Salmonella regions of atypical sequence composition that confer novel virulence functions, a collection of >100 S. typhimurium ivi genes discussed by Heithoff, D. H., et al., Proc. Natl. Acad. Sci. U.S.A., 94:934-939 (1997) was screened for aberrant nucleotide content (<49 or >59% G+C) and for absence of sequence homology in the DNA data base. The subset of these ivi genes that met these criteria was used as molecular probes to hybridize against genomic DNA isolated from a set of enteric pathogens, including four Salmonella serovars of differing host range and tissue tropism.
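The nucleotide-content part of this screen (<49 or >59% G+C) is simple to express in code. The sketch below is illustrative rather than a record of how the screen was actually run: the function names are assumptions, and ambiguous bases such as N are excluded from the count, which is one of several reasonable conventions. The example input is the first 50 bases of SEQ ID NO. 22 as given in the sequence listing.

```python
def gc_percent(seq: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T); whitespace and N are ignored."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def is_aberrant(seq: str, low: float = 49.0, high: float = 59.0) -> bool:
    """True if G+C content falls outside the low-high range used in the screen."""
    pct = gc_percent(seq)
    return pct < low or pct > high


# First 50 bases of SEQ ID NO. 22, copied from the sequence listing.
seq22_start = "GATCAGAATC TATGTTGTCA CAGATTAATA GTTTATTATA TATTTCATCA"
print(gc_percent(seq22_start), is_aberrant(seq22_start))
```

Sequences flagged this way would still need the second criterion, absence of homology in the DNA data base, before being used as probes.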

Table 7 below shows that DNAs prepared from 5 unlinked ivi genes hybridize strongly to genomic DNA prepared from one or more Salmonella serovars and not to any of the other enteric pathogens tested (>15 other enteric species or serovars). These Salmonella-specific regions fall into three distinct classes. Class I sequences (identified by Seq. I.D. #77, #217 and #180) hybridize to all Salmonella serovars tested, which are listed in order of increased host specificity; class II (identified by Seq. I.D. #170) do not hybridize to host adapted serovars (e.g., S. typhi); class III (identified by Seq. I.D. #22) hybridize only to broad-host range serovars (e.g., S. newport) and not to those that are host-adapted (e.g., S. typhi) or preferentially infect a particular species (e.g., S. choleraesuis).

TABLE 7

                   #77     #217    #180    #170    #22
S. typhimurium     ++++    ++++    ++++    ++++    ++++
S. newport         ++++    ++++    ++++    ++++    ++++
S. choleraesuis    ++++    ++++    ++++    ++++    −
S. typhi           ++++    ++++    ++++    −       −
EPEC               −       −       −       −       −
S. flexneri        −       −       −       +       +
K. pneumoniae      −       −       −       +       +

The probes are referred to by SEQ ID#. ++++ refers to strong hybridization; − refers to no detectable hybridization.
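The three classes described in the text can be derived mechanically from the Salmonella rows of Table 7. The sketch below encodes those rows and applies the class definitions as stated; the data structure and function name are illustrative assumptions, and only strong (++++) signals are counted as hybridization.

```python
# Salmonella serovars in the order used in Table 7 (increasing host specificity).
SEROVARS = ["S. typhimurium", "S. newport", "S. choleraesuis", "S. typhi"]

# Hybridization signals per probe, in SEROVARS order, transcribed from Table 7.
TABLE_7 = {
    "#77": ["++++", "++++", "++++", "++++"],
    "#217": ["++++", "++++", "++++", "++++"],
    "#180": ["++++", "++++", "++++", "++++"],
    "#170": ["++++", "++++", "++++", "-"],
    "#22": ["++++", "++++", "-", "-"],
}


def classify_probe(signals):
    """Assign class I, II or III from the set of serovars giving a strong signal."""
    strong = {s for s, sig in zip(SEROVARS, signals) if sig == "++++"}
    if strong == set(SEROVARS):
        return "I"    # hybridizes to all Salmonella serovars tested
    if strong == {"S. typhimurium", "S. newport", "S. choleraesuis"}:
        return "II"   # no hybridization to the host adapted serovar S. typhi
    if strong == {"S. typhimurium", "S. newport"}:
        return "III"  # hybridizes only to broad host range serovars
    return "unclassified"


for probe, signals in TABLE_7.items():
    print(probe, classify_probe(signals))
```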

Salmonella-specific virulence regions. Partial sequence analysis has identified several virulence-like genes in these Salmonella-specific regions. Examples include many adhesin-like functions. Specifically, the Seq. I.D. #77 region contains homologues to (i) ETEC tia (enterotoxigenic invasion locus A), which is involved in attachment to and invasion of gut epithelial cells; (ii) a family of afimbrial adhesins of enteropathogenic bacteria (Yersinia enterocolitica myfb and myfc, a chaperone and usher, respectively); and (iii) Staphylococcus epidermidis intercellular adhesin molecule (icaB). Similarly, the Seq. I.D. #180 region contains homologues to uropathogenic E. coli pyelonephritis associated pili (papC).

Disparities in genome composition reveal the genetic events of gene loss and/or horizontal transfer that are fundamental aspects of speciation. Accordingly, the Salmonella-specific regions comprise a fossil record of events that have led to the evolution of distinct species and serovars. These species-specific regions can be used as signature tags for rapid and sensitive detection of a given infectious organism. Such regions not only distinguish one pathogen from another but also point to the functions involved in host/pathogen interactions that lead to host specificity and tissue tropism within and between species, i.e., the functions that contribute to the specific disease or carrier state caused by a given serovar in a given host.

The foregoing description is considered as illustrative only of the principles of the invention. Furthermore, since numerous modifications and changes will readily occur to those skilled in the art, it is not desired to limit the invention to the exact construction and process shown and described above. Accordingly, all suitable modifications and equivalents may be resorted to, falling within the scope of the invention as defined by the claims which follow.

255 18 base pairs nucleic acid single linear NO NO DNA (other) 1CATTGGGTGC CCAGTACG 18 18 base pairs nucleic acid single linear NO NODNA (other) 2 TGTGCCTTCG TCGAGCAC 18 18 base pairs nucleic acid singlelinear NO NO DNA (other) 3 CAACGGTGGT ATATCCAG 18 390 base pairs nucleicacid single linear NO YES DNA (other) 4 GATCCGGATG GAATGGCTCC AGCGCGTCGGTTTTCTCGCC GACACCGAGG 50 AATTTAATCG GCTTGCCGGT GATATGACGA ATAGAGAGCGCCGCACCGCC 100 ACGCGCATCA CCATCAACTT TGGTCAGCAC CACGCCGGTT AACGGCAGCG150 CTTCGTTAAA GGCTTTTGCG GTATTCGCCG CATCCTGACC GGTCATCGCA 200TCGACGACAA ACAGCGTTTC TACTGGCTTG ATAGAAGCGT GGACCTGTTT 250 GATTTCGTCCATCATCGCTT CGTCAACATG CAGACGACCG GCGGTATCCA 300 CCAGCAGCAC GTCGTAGAATTTGAGCTGCT TCTTGGCGGT TGACAGTATC 350 ACGTTCTGCG AAATCAGACG GAGAATCACGCAATTGTACA 390 238 base pairs nucleic acid single linear NO YES DNA(other) 5 GATCATAGAG GTGGATACGG CTTTTCAACG CCTGTTGGAC GGCGTGCCAG 50TCGGCCTGTT CAAAACGCTG CTGCGCGCCG GAAGTCACTT CCAGAAATCG 100 ACCATACTGCGCGTCAAAGC CTTGCAGGAT GGTTTGAGCA ATCAGTAATT 150 CCAGGCCACG CGGCATTTTTTTACCTCATC CGGCACCACG TCATGCCGGA 200 TGCGCGTTCG CTTATCCGGC CTACGCTATCTGTAGGCC 238 309 base pairs nucleic acid single linear NO YES DNA(other) 6 GATCGAGAGG ATGCGGTGGT GGATGCGCAT ATTACCGGAT GACGGCGTGA 50ACGTGTTATG CGGCCTACCA GCCCAATGCG CGATACCAAG CCGGATAAGC 100 CGCCAACGCCCACCCCGGCC CCGCCGCGTA TTTAATCAAG TTATTACCTT 150 TGATCGCACC CTTGAGGTCAGGCGCGTGAT AAGTTCGTAA GCACTTACTT 200 TTGTCATTTC AGCGATACGT TCAACCGGCAGACTTACCCA TAGACACGAT 250 CGCGGTATCT CGGTTGCCAA TTCGAATCTA TCCATGGACGCGACATCGAC 300 TACGACATT 309 362 base pairs nucleic acid single linearNO YES DNA (other) 7 GATCCGTTTT GACCATCCCG TGTTTGGTCG AAACCGTGCAGCCTTCTACC 50 AGCGGCAGTA AGTCGGGCTG TACGCCTTCG CGTGAAACAT CCGGGCCGGA 100GGTTTCGGCA CGCTTACAAA TTTTGTCAAT TTCATCGATA AACACGATGC 150 CGTGCTGTTCAACCGCGTCG ATAGGTCCTG TTTCAGCTCT TCCGGGTTGA 200 CCAGTTTAGC AGCCTCTTCTTCAACCAACA GTTTCATCGC GTCTTTAATT 250 TTCAGCTTAC GGGTTTCTGT TTCTGACCGCCCAGGTTCTG GAACATAGAC 300 TGCACTGCTG TCATCTCTCA TGCCGAGCCA TATCTCTAGCCATCGGCGCA 350 
GTATTGACTT TA 362 206 base pairs nucleic acid singlelinear NO YES DNA (other) 8 GATCAAGAAT GTGTTCTCCC AGCGCATCCT TGATGGTTTCTCCCAGCACC 50 TTGCCGAGCA TACTGACATT ACTAGCAACG CGGAATATTG TTCGTTCATA 100TGCCCCCAGA CGCCCCATCT TTAATGTAAT TGCCCTGTCT CTTTCATGCC 150 ACAGCGCAGTGGCTGCGTGC GTATGCAGTT ATGCGAATGC TCGTGCTGCG 200 ACTAAT 206 250 basepairs nucleic acid single linear NO YES DNA (other) 9 GATCGTCGGTGCGAATGGTG ACGTCGGCAA TCTCTTCGTA CAGCGGATTG 50 CGTTCGTTAG CCAGCGCTTCCAGAACTTCG CGAGGCGGTG CTTCAACCTG 100 CAACAGCGGG CGTTTTTTAT CACGCTGCGTGCGGCAGTTG TTTTTCGATC 150 GGTCGTTTCA AGGTAGACCA CGACGCACGG CGAGAGACGGTTACGGTTTC 200 ACAATTTTAC AGAGCCACAT CGGAACACAC ATACCTTTAT ATCTATACTT250 176 base pairs nucleic acid single linear NO YES DNA (other) 10GATCCAGGCT TCGCGTTCTG ATAGCTGTCA TACGGTACGG TGGTGATTTC 50 CGGATGCTTATCCATGATGA ATTTCTGGTG TCGTCGTACC GTTCTGTACG 100 CCGACTTTCT TGCCTTTCAGTTGATCAACG CTGGTGTATT GCCTGCTGAC 150 CACGAACAGC GTGAGTAGGG TATATG 176312 base pairs nucleic acid single linear NO YES DNA (other) 11GATCTTCCGC CCAGCCTGCG ACTTCTACTT TCGAGGCCTG GATTTCGAAA 50 CTTTGCCCCTGTGCCGGCGA CGCGACAACC TTACCTGTTA CTACCACGGA 100 GCAGCCTGTC GACAGGTGTAATACTTCTTC ATTATAATTG GGCAGAGAAT 150 TATTAATGAC AGCCTGTACA GGATCAAAGCAGGAGCCGTC ATAAACGGCG 200 AGGAAGGAGA TGTCCAGCTT TTGAATCTCG GTCGGGTACGACCCATCCCG 250 CGCAGTGACT TCTTGGTCAA CGGCTACTGG CCTGGAGTAC TGCGGCTACG300 GCACACGTCA TA 312 289 base pairs nucleic acid single linear NO YESDNA (other) 12 GATCCCAGAT AATCGCCAGG ATCACCATCA CCACCGTTGG CATCAACCAA 50GCCAGTCCCT GTTCCGCCAG CGCAAACGCT GACTCCAGGC TGGCAGCATA 100 TCGCCGAAGGATGCTTTGAT GCCGTCAAGG ATACCAAAAA GCAGACTGAT 150 AAACATGGCC GGCGCCGATGATACGGGTGG AATTATGCCA CCATGAGCGG 200 GTAAAACTTA ATACAACCAG TGCGATACACGGCGGATAGA TAGCGTCATG 250 ACGGAATTGG AGATTATCAG ATCGCTCAGT CGAGGTTGA 289240 base pairs nucleic acid single linear NO YES DNA (other) 13GATCAATAAT GTTATCCCGG CTTAACACTT CATCCGGGTG ATGCGCAAAA 50 TACATCAGAAGATCGATCAG CCGTGGTTCA AGAGTAATCT GGCGTCCCTG 100 ACGACTGATC TGACCAACAGAAGGTATAAC CAGCCACTCT 
CCAATGCGTA 150 CAACAGGTTG CTGCATAAAA AGATGCCTAACGAGCTAAGT CATACGTATA 200 TACACGATTG CACAGACTTT TATCCTTTGT AAGAAGCTAA240 260 base pairs nucleic acid single linear NO YES DNA (other) 14GATCAGAACC TTAAAACAGC GTAGACACTT TTTTGGCTTT GTGAGAAATC 50 CACGGACAATTCCGCGAGCC AGTTATCGAC GTAGAACAGA GGAAGGGAGG 100 AGCCCTTGCC GAAAAGGCCATCCCATGGTG AATCGGGAAC GCTCCGGTTC 150 CCGTTAATGC CTAATAATTA TCGTAATATAAACAACCGGA AATCAGTATA 200 GGCCGCAATT TTGACGATTC ACCGAAATTG TTAGCGTGCTAATTACAGAG 250 TACAGTTAGT 260 314 base pairs nucleic acid single linearNO YES DNA (other) 15 GATCGGCATA CAGCGCGTAC ACTTCATCCA GACGTTTGAGGGCGTTAACC 50 ACTTCCGAAA CGGCCTCTTC AATCGACTCG CGTACCGTGT GTTCCGGGTT 100TAGCTGAGGT TCCTGCGGCA GGTAGCCAAT CTTAATGCCG GGCTGCGGGC 150 GCGCTTCGCCCTCGATATCT TTATCGAGCC CCGCCATGAT GCGCAGCAGG 200 GTAGACTTAC CGGCGCCGTTAAGGCCCAGC ACATCCGATT TGGGCCCAGG 250 AGAGCTCAGG CAGATGTTTC AGATATGACGTTCAGACACT GCGAACCGAT 300 GCTGATAGAT GAGC 314 350 base pairs nucleicacid single linear NO YES DNA (other) 16 GATCGCCATT CTGCTAACGACTCTGACGCT GGCGCTGCTC TCCAGGCTGC 50 ATCGGTTATA ACATTCTGGC GACACGGGCAAAACGCGGCT GTCGCCAGTC 100 TCTGTCAGAA ACGGTAATCC ACCGCCATAA AGTAACGACGTCCGTCTTCG 150 GTATAACCGT AGTCGTCGCG TTTGAGATCT TTATCGCCCA CGTTCAGAAC200 GCCCGCACGC AGTTTAACGT TTTTCGTCGC CTGCCATGCC GCGCCGGTAT 250CCCAGACCAC GTACCCGCCC GGCGTTTTTC GCTGTTTGCC TCTGTCGGCC 300 CGCTTACGCCGGTATAATTC CTGATACGTA GATGACAGTT GAGCTGACCG 350 336 base pairs nucleicacid single linear NO YES DNA (other) 17 GATCGTGCAA ATGCGCGCTAAAGGTGGCGG CGTCCATAAA GCCGGTGACT 50 CGCGATTGCG GCTGTTCCTG GCCTTGGGTATTAAAGAACA GAATGGTGGG 100 CAGCCCGAGG ACTTGCAGAT GCTTTAACAG CGCGACATCCTGCGCATTGT 150 TAGCGGTGAC GTTAGCCTGC AAGAGCACCG TGTCGCCGAG CGCCTGCTGG200 ACCCGCGGAT CGCTGAAGGT ATACTTTTCA AACTCTTTTA CAGGCCACGC 250ACCAGTCGGC GTAGAAATCA GCATAACGGT TTGCCTTTGG CCTGCGCCTG 300 ATTGAGTTCATCCACGTAGA ATAGCCGTGA ATTGAG 336 286 base pairs nucleic acid singlelinear NO YES DNA (other) 18 GATCCGCGAG GTGCGCCAGT TGCACCATCT CCAGCAATTGCGTCACTTTG 50 TTTTAATCGC CGCCGCCGCA GTTGGGCGTC 
GCTCGCGCAG ACCGTAGCCA 100AAAGCGATGT TGTCAAACAC CGTCATATGG CGAAACAGCG CATAGTGCTG 150 AAAACACAAAACCGACTTTA CCTACTGGTG AGGCGCTAAC GTCGTACGTG 200 GAAACGATAT ACCGTGGACTGTGTCAGCCC GGCAATAATC CCGGCTGTTT 250 GCGGAACTAC GCACAGGACA TTGCGAGATATTACGG 286 325 base pairs nucleic acid single linear NO YES DNA (other)19 GATCGCGAAA GGCGTACATC TCACGGAATT TCCAACCGGT ATCAACGTGC 50 AATAGCGGGAACGGCAACGT ACCCGGATAA AACGCCTTAC GCGCCAGATG 100 CAGCATGACG CTGGAGTCTTTACCAATGGA GTACAGCATG ACCGGATTAG 150 CGAATTCCGC TGCCACTTCA CGATAATGTGATACTTCGCA CAGTTGCGCA 200 GTGGTGAGTC GTTTTGATCA TACGTCTTTG CATCGTTTTGCTAACTGATA 250 CGACTAGGCG GTATATCGAT GATGTGTCTA GATACGCACA TCACACCGAT300 CCTGCAATTC ACGTACACGA TCTGC 325 200 base pairs nucleic acid singlelinear NO YES DNA (other) 20 GATCAGGTGC GGTCGGTAAT TGACAAAATA TGGGCAAATGGCCACGACAT 50 TACCCCTTAA TTGATTGGCA GCAGCTCGTG GCTGATTGAT TTTAGCCGGA 100GCCGGACGCT CCGATTTTGG CGTCAGATAC CAATAACCCA ATCCATGAAT 150 ACACACGACAAGTATACGGG TTACACACAG TATACATCGC AGATCGCTGT 200 264 base pairs nucleicacid single linear NO YES DNA (other) 21 GATCGGTTTT ACCCTTCGTCCCTTTGATAT AACGCGTGAC GCCGTTAACG 50 TACCGCCAGT GCCGACGCCG AGATAAACACATCCACCTGA CCATCGGTCT 100 CCAGAGTTTC CGGGCCGGTG GTTTTTCATG GATTCTCGGGTTGGCAGGGT 150 TGCTGAACTG CTGGAGCAGG AGATATTTTT GCGGATCCGT GGCGACAATT200 TCTTCGGCTT TCTTGAATAG CGCCTTCATC CTGGCCTTGT CAGCACCAGA 250TTGGCTATGC TTAG 264 324 base pairs nucleic acid single linear NO YES DNA(other) 22 GATCAGAATC TATGTTGTCA CAGATTAATA GTTTATTATA TATTTCATCA 50AAATAATCGA CGTCAAGTTC TTTGTTTTTA TTTAGAGTGA ATACTTCCTG 100 TCGTTTTTTATCGTTTACAT AATCGACTAC CGTAACTGCA ACATTCTTAT 150 TTTTTTGTTT CTCTATACATAGTAATATGG TGTCAAGTTC AAATTTTATT 200 TCTTCAAATC GCAAATCAAA GAAAAAATCTATATTTTTAT TTAAAATCGT 250 TGTCAATTAT CTTTAAAACG ATGTTTTACG TAACATTGTCGTATATATCG 300 TCTGAGTCTA ATCAATATCA TAGT 324 276 base pairs nucleicacid single linear NO YES DNA (other) 23 GATCTTCGCC TACCGGCACCAGATTGGTTT GGTACAACAG AATGTCTGCC 50 GCCATCAGCA CCGGGTAATC AAACAGGCCGGCGTTAATGT TTTCCGCATA 100 GNNNCAGATT TATCTTTAAA 
CTGCGTCATA CGGCTCAGCTCGCCGAAATA 150 GGTATAGCAG TTCAGCGCCC AGCCAAGCTG CGCATGTTCC GGCACATGGG200 ACTGAACGAA AATAGTGCTC TTTTAGGATC ATACCACATG CCAGGTACAG 250NNAGATTCCA GGCGTTTACG TAGTGT 276 329 base pairs nucleic acid singlelinear NO YES DNA (other) 24 GATCCGGCGC CGGAGCCACC ACGCCTTCAC GCGGGGCTCCGGGTTCGGCG 50 CGGGCAGATT CATCAGCTTC GCCAGAATGC TCGCCAGCTT CAGGCGCATT 100TCCGGGCGGC GGACTATCAT ATCAATAGCG CCTTTTTCGA TCAGGAACTC 150 ACTGCGCTGGAATCCTGGCG GCAGTTTTTC GCGAACGGTC TGTTCGATAA 200 CGCGCGGGCC GGCGAAGAATCGAGACTTTT GGCTCGGCGA TGTTGAGATC 250 GCCAGCATCG CAAAACTGGC GGAAAAGGCCCATTGTCGAT CGTACTACGA 300 AATGTAGGGC AGACGCTCTG CATTTAGAC 329 222 basepairs nucleic acid single linear NO YES DNA (other) 25 GATCCCTAACACCCGGTCAG TTCCCGACAG GCCGGTCTTT TCTACTAGCT 50 GACCTATCAC AAAATTCACGACAGCGCCGA TCGATAAGCG TCGCGATAAA 100 CAGTACCGCG ATACGAATTC CCATTACGAACCAGTTCGTC TTCAAAGCCC 150 GTAAACCAGA CAGACAGGTA AGTGTAGTAG TGACTGGCGACAAAGAAGCA 200 CACCCACGTA CCAGCATACG TC 222 166 base pairs nucleic acidsingle linear NO YES DNA (other) 26 GATCAGTATA CAACTATCAG TAATTCGACGATAGACCGAA GTGTGCTTGC 50 TGGCGCTTTA TCGTCAAGGA TAATTGCCGC TTTGACGGCCTTCGCGCTTC 100 CTGCCAACTG GCTTCGTCTT TGTGCATGAA TCACCGCCAG CGGCTCTGCC150 GCTCGATNTG TCGATC 166 333 base pairs nucleic acid single linear NOYES DNA (other) 27 GATCGCTTAA CAGATAATGA CTGGCGCTGC GGGGCTCCAGTACGATATAG 50 CCGCCTAGCA ACACGACAGG CGCGCTTTTA TGGTTCAGGT CGCGACGAAT 100GGTCATTTCA GAGACGCCCA ACAGGGTCGC GGCTTCTTTA AGATGAAGTT 150 TATCGCTGCGTTTTAAGGCC TGCAGCAATT GACCAATAGC GTCGTCGCTC 200 GGCTTTCCAT AGTTCCCCTGGAGAGTTAAA TAAGCGCTCC GCACCATACA 250 GAGCGCTTAA TATTACTCTT TTTTGCGCTATTTAGTCACG TACCCAGCCT 300 TTTCGAATGG GCAATGCAAC AGAACGTACA CGT 333 221base pairs nucleic acid single linear NO YES DNA (other) 28 GATCGCGCTCAATCGCTTCC GCCGCCAGTT TAGCCGCCAG CTCCGGCGTT 50 TTTTCATGCA CCAGAGCTTTCTTAAGCGCT TTTGGCGTAG CACCACTTCT 100 TTGGTTTGTA CTACCGGCGT GGTGGCCTTCCAGCGATAAG CCTCTTTCTT 150 TACTGGCGGT TTCCAGCGGG ACGGNGGGNT GTACNNTCCGAAACCGAGGA 200 GCGTCAGNAG AGTTATTACG G 221 368 base 
pairs nucleic acidsingle linear NO YES DNA (other) 29 GATCGTCGTA CCGCCAACCG AGCCGCCGGGTATGTGTCGT TAAACTCTGT 50 CGCCAGACCA TAGTTAGAGG TAATAGAAGC CCCCCAGCCAAACTGGTCGT 100 TAATCGGGGC GACAAAATGG ACGTTCGGCA CCCAGGCCGT CAGCGCGATG150 TTATCCGCAT CTAACGTCCG ACGAGATGGC GATGTCCCGC TAATATTAAC 200ATCAGGATCA ATATAAACGC GCCCGCTGAA AACGTCGGGC GGTCAAACAT 250 GTATTACGCGGGTGCGCTAC GTACGCATCA TCTGCGATGC GCTCACGATA 300 GCGCAGCAGA GAGAATCGTACTGAGCTCGC GACAGTGTGA TGTCGATCGG 350 ATCGCGCTTT GCAGTTTG 368 288 basepairs nucleic acid single linear NO YES DNA (other) 30 GATCTCCACAAACTGTTCCG GCTGAGCGAT AGCTTAAGTA GCGCATGTTT 50 CCTCCAGGTA TGGAAATGCTCTGTGAGGCG GTAAGTCGAG CCCACGTACG 100 GCCCCTGCTC CTTCTTACCC ATGCGCAGCATCTTCTTCAT ACAGACGCGC 150 CGCCGGGTTC GAGACCACAT TCGGGTGCAG CGGGTTAGTGCCCAGCGGCG 200 TTTCATCGCT CGTAGTGTCA GGAACGCCTT CGCATTATCA TAGCAAACGA250 ACGTTCCAGC CCTTTCGCGT CATGAAAGAT GCGTCCGG 288 254 base pairs nucleicacid single linear NO YES DNA (other) 31 GATCAATAAC CGCATCGTTGTAGAAGTTCC CCTGCAATTT CANNNNATCC 50 AGATAGTTGT TCTGGCTCAG GCCGACGGAAGAGAAGCCAC GGATAATCAC 100 GAAGTCATAG GTATTGGAAG CGCCGCGCTG CTTACCGTTACACCCGCGTG 150 TAACCCAACG CTTCTTTACT GACTGGAATT GATGCATCTG CATCTCTTCG200 TTAGTGACCA CCGAAACCGA CTGTGCGTTT TTCGATAGTA TCAGTTTGTG 250 TGCG 254176 base pairs nucleic acid single linear NO YES DNA (other) 32GATCTTGTTG GCTCGCCTCT CCCCTCGGAC AACACGGTAT AAAACGCGGT 50 GATAGAGCCACCGCCGTGGA TGCCATTACC GGCACGCTCG ACCAGCGCCG 100 GCAGCTTTGC GAACACCGAGGGCGGATAAC CTTTGGTGGC TGGCGGTCGC 150 GATTGCCAGC GCATTAGTGC ATTGAT 176338 base pairs nucleic acid single linear NO YES DNA (other) 33GATCGTGATA TTCAATGCAC GCCTGCAGCG TGTTTTCGAT AAGCGTGGCG 50 ACCGTCATCGGGCCGACGCC GCCCGGTACT GGCGTGATGT ATGACGCGCG 100 CGCCCGGGCT TCGTCAAACACGACGACGCC AACGACCTTG CCATTTTCCA 150 GACGGTTAAT ACCGACATCA ATCACAATTGCGCCTTCTTT AATCCATTCG 200 CCGGGAATAA AGCCCGGTTT ACCTACGGCG ACAATGAGCAAATCAGCATG 250 CTCGACATGG TGACGCAGAT CTTTGGTAAA GCGTGCGTAA CGGTAGTCGT300 ACAGCCAGCC AGCAACAGTC ATGCTCATTG GGCTCAAC 338 319 base pairs nucleicacid 
single linear NO YES DNA (other) 34 GATCTTGCAG CGCGCCGTGCCAGGCATAGC GCACCTGCTC ATTAAAGACG 50 TTCGTTTTAC GTGAGTTCGG TTTCGGCGTCGGCTTCTGGC GTGCTGGCGC 100 GTTGCCGCCG CCTGTTCCGC GCGAGACTTA CGCAGTCGATCCAGCCGTGC 150 GCGAACTGCC TGATTTGGTT AATCGCGTGG GCCTATTCAT TGGCCAGGCC200 ACCATGCAGA TGTCCATCGT CAGGACGAGC TGCCTATAGG AACGACGGGA 250CATAAGTCCA ATATGTGCGA GCGTCAGTAC CGTACCCTAA GTAAACTCTT 300 CAACAGAAGTAAATGCCTT 319 418 base pairs nucleic acid single linear NO YES DNA(other) 35 GATCGATTTG CGCTGGCAGG TTGCTGCCGG TATTGACCTC TTTGTACATA 50TTCAGCGGCG CGTTCTGCGA GTAGCGCAGG TTATCTTCGA TATAGGTATT 100 AAACACGCCTTTGGAGAGCG CGGCTTCATC ACCGCCGCCC GTCCAGACGC 150 GTTGGCCTTT TTTACCCATGATAATCGCCG TGCCGGTATC CTGGCAGGTC 200 GGCAGAATGC TTTGGGCGAT CTGCAGGTGGCACTTTTCGG GGAAATGTGC 250 GCGGAACCCC TATTTGTTTA TTTTTCTAAA TACATTCAAATATGTATACG 300 CTCATGAGAC AATAACCCTG ATAAATGCTT CAATAATATT GAAAAGGAAG350 AGTATGAGTA TCACATTCGG GCTATCTTTG GATTCTCGTT GACACAGAAC 400GAGGAAGAAG CGAGACAT 418 350 base pairs nucleic acid single linear NO YESDNA (other) 36 GATCAAGAGT CAGGGGTAAT TTTACCTTTT GCATAGGGCG CGCATATTAA 50CTTCGTAACG TCATATAGTC AAAGAAAAAG GCAGCCTGCG GTTGCCTTTT 100 GCCAATAATTCGCACACATT GCGGGTTACA GACTTATTTT CGCTCAAGAC 150 GAGTCAGTAT GACAGGCTTGAAGACCGAAG AGCTATGTTT AAGATGGCTC 200 TCATCATTAC GCTATATCTG AGGGAAAAAATATGCCCCGT CTCATCCTTG 250 CGTCTACCTC TCCCTGGGCG TCGCGCGCTG CTGGAAAAGCTGACGATGCC 300 TTCCGATGCG CGCGCGATGT GATGAACCCA TGCCGGGCAC GCGCTCAGTG350 270 base pairs nucleic acid single linear NO YES DNA (other) 37TGCGACAACA CACCCGCCAA AGCCGCCGCC GGTCATGCGC ACGGCGCCTC 50 GATCGCCGATGGTCGCTTTG ACGATGTCTA CCAGCGTGTC TATCTGCGGG 100 ACGGTAATTT CGAAATCATCGCGCATTGAG GCATGGGACT CCGCCATCAG 150 TTGGCCCATA CTTCGAAATC ACCTTTCTCCAGCAGGCTTG CCGCTTCAAC 200 GGCGGGCATT TTCGGTCAAT ACATGGCGAA CCGTTTTCGGATACCGGGAC 250 AGTTCCGTGG CAACGGCATT 270 280 base pairs nucleic acidsingle linear NO YES DNA (other) 38 GATCCAGTGC TTTCGCCGCG TCATCCACAATGACGTCAAA GCCAAAGGTT 50 TCGGCGCGAG TACGCACGAC GTCCAGAGTT TGCGGATGGACATCAGAGGC 100 GACAAAGAAC 
CGGTTGGCAT TTTTCAGTTT GCTGACGGCT TTGCCATCGC150 CATCGCTTCA GCGGCGGCGT CGCTTCATCC AGCAGCGAGG CGAACGATGT 200CCAGCCCTGT AGTACAGCGT ACTGTTGAGT TACAGACTCA AACTAAATCG 250 TATAGATTTAGCCTACACTG ATTTACATTA 280 275 base pairs nucleic acid single linear NOYES DNA (other) 39 GATCATCGCC TTCAAATTGA CCTGCTTGAG ATCGAAAATGAGCTGCGCTA 50 AGTCCTCGAT AGAGTAGATA GCGTGGTGCG GTGGCGGGGA GATCAGCGTC 100ACGCCCGGCA CTGAATACGC GAGTTTAGCG ATATACGGAG TGACTTTATC 150 CCCCGGCAACTGACCGCCTT CGCCGTTCGC CTCACTTTAA TCTGAATCAC 200 ATCGGCATGA CAGTAGGTCGGTCACAAGCG CGACGACTCT ATCGCAATAT 250 GTCAATCCGG TCCTACATAT CATTT 275 333base pairs nucleic acid single linear NO YES DNA (other) 40 GATCTTTCGACTCGATGTTG GCGACGAAGA TAAAGTTCGG CAGCAGCTTG 50 CCCGCGTTGT CATAAACCGGGAAATACTTC TGGTCGCCCT TCATGGTGTA 100 CACCAGCGCT TCGGCAGGCA CGGCGAGGAATTTCTCTTCG AATTTCGCCG 150 TCAATACCAC CGGCCATTCC ACCAGCGAAG CTACTTCTTCCAGCAGGCTT 200 TCGCTCAGGT CGGCATTACC GCCAATATTA CGTGCTGCTC TCAGCGTCCG250 TTTGATTTGG CTTAGGCTCG TAGTCGCATG ACTTACGGAC TCAGAGAATT 300GCGGTACTGT CAGATGTGAG GACCGTACAT AAG 333 233 base pairs nucleic acidsingle linear NO YES DNA (other) 41 GATCGGGCAT CGGCACGACA CCGGTATTCGGTTCGATAGT GCAGAACGGA 50 AAGTTTGCCG CTTCAATACC GGCTTTTGTC AGCGCGTTGAACAGGGTGGA 100 TTTCCCGACG TTGGGCAGAC CGACGATACC GCATTTGAAT CCCATGATTT150 AACTCACCTT AATATCTTAA TAATCAACCT GTTATAGAAA ACAGATTGCA 200GAATGGAATA CTCGCTATTA TCACGCGCGC AAA 233 302 base pairs nucleic acidsingle linear NO YES DNA (other) 42 GATCAAGCGT GTCCGGCGAA AACGTTACGCGTTCTCGCAG CGATACAGGT 50 GCCGTTTTAT GGTTAATACC GAGCGCTAAA AGGGTCATGTCTGCGGGAGT 100 AGTACCAGCG TTGATATGGT TAGTCTGCTT GCATCATACA GGATGCGCGT150 GGTCAATAAA AGAGAGAGCC CCCTTTTGGA GTAATTGGCA GCGCTCGCTA 200ATTTGATGAT TTAAGACACT TGAAAGTAGA CGATGTCACC AGGCGCCTAC 250 ATTAAAGGCTATACTGTACG ATAGCAAAAT TTCCGATCCG CCACTTTCAC 300 TC 302 262 base pairsnucleic acid single linear NO YES DNA (other) 43 GATCTACTTT CGGGATGGCAGCGTATCTGC CGCAATACAC CCTGATGGAT 50 GTTATGCCTG GATCTGATTA CTCTTCTTTGGGCGAAGTTT TCGACCCGGC 100 TCTTTAACTT CTGCCCGGGT CTGAAGGTCA 
CCACGCGCCGTGCTGTAATA 150 GGAATATCTT CACCCGTTTT CGGTTACGCC CCGGACGTTG ATTTTTATCA200 CGCAGATCGA AGTTACCAAA ACCAGAGAGT TCACCTGCTC ACGTTTCAGA 250GCACGACGAT CT 262 153 base pairs nucleic acid single linear NO YES DNA(other) 44 GATCAGGTCC ATATTTGTCT TTGCCTTTCT ACCCGACACG TTTCGGGTGT 50GCGATTCGGA TTAGTCCGCC AGAAATAGCG GGCCCATTGG CGGTTTTGGA 100 AGGTCAAAAAGGTCAGGGTA ATCCACCGCA ACCAAATATA GCCCTTCCGC 150 CTT 153 169 base pairsnucleic acid single linear NO YES DNA (other) 45 GGCGCGTTGG CAGATTTTGCCAGACGACGG GCGATTTCGG TTTTACCGAC 50 GCCGGTCGGC CAATCATCAG AATATTTTTCGGCGTTACTT CGTGGCGCNN 100 CTTCATCAAG CTGCATACAC GCACGTTACN ATCNNGACGGAACCTTTGTA 150 TCTGCGATAA TNNTTGTAG 169 282 base pairs nucleic acidsingle linear NO YES DNA (other) 46 GATCGCTGTA GATTTTACAA GTCTTCTTCAGCGATACACG TCTGCACAGC 50 AGGCCGAAAC CGGTGTTGAT GCCGTAGGAG TACGCCTTCAGGCAACGATA 100 TCATTGACAA CGCGACGTGG CGTTAATACG TCAATGGCAT GGCCTTCCAG150 CGAAAGCTGT ACGATGAGAT ATGACATGAG AGAGACTTAA CTGCCCCAGA 200GTATATATTG TGTTCATATC AGCCTTTCCT CAACAACCAT CGTAAATTCA 250 GACTTACTCACACACATTCA CGTAGATCAT TC 282 258 base pairs nucleic acid single linearNO YES DNA (other) 47 GATCGCGGGT CAGTGTACGC ACCGCTTCCG GCGTATTTTTCCCGCTATTA 50 AAATAGAGCT TGTCGCCAAC AATCAGGTTA TCGAGATTAA TGACCAGCAG 100CGTATTTTTC TTCTCAGCGT CACTCATCGT TTGAGTAAAT TTGGGGGCCT 150 AGCTTTCCCTCTTCTTCCCC GCTGGTGGCG ATAAAACGAA TCCCGTAATG 200 GGTCGGTATA TCTTTCAGACGGCGCAGTTC CAGCATAAGC CCTAATCCCG 250 CGGCATTA 258 315 base pairs nucleicacid single linear NO YES DNA (other) 48 GATCGCGACA TGCGCAACATCTACCAGTTT ACTTAACTGA CTAAACAGTA 50 AGTCGACCGA CCGGGGACTG GCAACGGTCAATTCAATATT TATATTCTGC 100 GCATCGGTCG CGGCTTCCAT ATTCAATGGA GCACACCTGAAAACCACGAT 150 GGCGCACCAC GCGTAAAACA CGTTCTAAGG TTTCTGGATT ATAGCGTGCC200 GATACATTGA CCTGATGTTG CATCATGATA TTTCACGATT TCAGAGTCAT 250GGCGCAGGCG CACACGCAGA CATTTGAAGT CTCGATGAGA CGAGAGACGC 300 CTCAGTCACTGTCGA 315 268 base pairs nucleic acid single linear NO YES DNA (other)49 GATCCAACGT CTGGCGTAAT GCCAGCATGT CGTACTGGGT GTTGTTGCCC 50 AGCTCCGCACGTGGGTCGCC 
TTTCGCCACC ACGTTGAACG CCAGACCATC 100 TTTAATTTGC GGCGTCGGCCAGCATGGTAA AGCGGTTGCT GAGTACACGC 150 GCTTCACGGA ATACCGTGGT GGCTTGAGCACCGCTCACCT GCTTGAGTCG 200 GCTGTTCAAC TCGGCGTAGT CCCCACATTA AGGCTGGTTGTACACGTCGT 250 TGTTGGTGTA ACCGCGGT 268 296 base pairs nucleic acidsingle linear NO YES DNA (other) 50 GATCTAAAAT TCAAATACAG GAACAGGGAGTTCTGGTGCA GAGGGTACTA 50 TGTCGATACG GTGGGTAAGA ACACGGCGAA GATGCAGGACTACATAAAGC 100 ACCAGCTTGA AGAGGATAAA ATGGGTGAGC AATTATCGAT CCCGTATCCG150 GGCAGCCCGT TTACGGCGTA AGTAACGAAG TTTGATCGAA ATGTCAGATC 200GTATGCGCTG TTAGGCGGCT GGTAGAGAGC CTTATACCAT CTGAAAACTC 250 CGTATCCGAGATATTATAGA CTATTGGCAA CCTGAATCTC TCGATT 296 213 base pairs nucleic acidsingle linear NO YES DNA (other) 51 GTACACAGAC GCCTTTCAGA TTGGCGATGACGCATCCATT GAGAACACCC 50 CATCGGTGGC GATCAGGACA TGACGCGCGC CGGCCTCACGCGCCTCTTTC 100 AGCCGCGCTT CCAGCTCTGC CATATCGTTG TTGGCATACG CTTCGCTTTA150 CACAAACGCA CGCGTCAATG ATAGACTGGT TCAGCGCGTC GGAATATAGC 200GTTCGCGCAG CAA 213 113 base pairs nucleic acid single linear NO YES DNA(other) 52 GATCGAAACT CGCCACGTTA ATCACCGTCG CCACCACCGG CGGCCAGCGT 50CCGTAAAGCA GCGCAATCAC CACTACGGCC CAGGCAAATC GATGCATTAC 100 CAGATTGGCGGCG 113 337 base pairs nucleic acid single linear NO YES DNA (other) 53GATCTTCCGG GTTAAATTGC AACAATGCTT CGCTAACGCG CAGCCAGCTC 50 CATTTGCGGTTCCTCCATCA GCGAGGATTT CAGCGTATCC AGTAGCTTAC 100 GAATCACTTC GGCGTTATCCGCTTCGTCCA AATCTTCATT AAACAACTCG 150 GCGACCGGAC TAATATTGCC TTTTAACCAGACTTCCAGAG TATGTTCATC 200 AAGCGTTTTC ACCGTTCGAA CGGTTAATCA GCCACATTTCCCCTTTCCAG 250 CGATTCAATA CGCAAATCAA CTGCGTTGGG AAGATAACCT AGGCACAACG300 GCAAATCAAG ACGTTGCATA CATATAAATA GCGCCAC 337 313 base pairs nucleicacid single linear NO YES DNA (other) 54 GATCATAAAA CTTCCGCGTGTATATGTTGG TTGGAACCGT AGAGATATAG 50 ACAGGTGGTT CTACACAGGC GTTTACCCCTACCGTCGCAA ACATTTCTTT 100 AATCAGGCTT TCTCTTTTTT CTTCTGATGG ATGCGAGTGATTAAACTCAT 150 ACATTAACGT TTTCCCACGA AGTCTTTTTT CCGGTAAGCC TTCGCATATA200 TCGGTAAATA GCTTGCCTGC TCTTATCTTT CGGTCATGGC ATGTTCATCG 250CGATCACTCC GTTATGATAT GTCTCGATAG 
CCTCGATCCA ATGATGCTAC 300 GCATCATCACTCA 313 300 base pairs nucleic acid single linear NO YES DNA (other) 55GATCGAATTC AGATTCCATT ATCGCCATCA GATATTCCAG ACGTTCAGAT 50 TAACGTCGGACATCTCCAGT ACGGACTGTT TATCCGCCAG TTTCAGCGGC 100 ATATGCGCGG CGATGGTGTCAGCCAGACGT GCAGGGTCGT CAATGCTATT 150 GAGTGACGTC AGCACTTCCG GCGGAATTTTTTTGTTCAGC TTGATGTAGC 200 CTTCGAACTG GCTGATAGCG GTACGACCAG CACTTCTTGTTCACGCTCAT 250 CAATGGCTGG CGAATAAGGT ACTCGCTTCG CGAGAAATGT CGCGTGCAGA300 423 base pairs nucleic acid single linear NO YES DNA (other) 56GATCCCACTT CTTGAACTGC TCGAAGCAAA CGCCTTCCGG CAGATCATCG 50 CGCGCCACATACAGCTGAAT GCGGCCGCCT ACGTCTTGCA GGGTAACAAA 100 AGAGGCTTTA CCCATAATACGGCGCGTCAT CATACGGCCC GCGACGGACA 150 CTTCAATATT CAGCGCTTCC AGTTCTTCAGCTTCTTTCGC GTCAAACTCT 200 GCGTGCAGTT GGTCTGAGGT ACGGTCAGAC GGAAATCGTTGGAACGGATA 250 CCTGCTCACG CAGTCAGCCA GCTTTGCACG TGCCTTATTT ATTGTTAAGA300 TCGACTACTG TACGCCTGTC TTTGTCAGAC ATGTGATCTC ATAGCCTGGC 350TTTCAAACTT GCTCGATATG ATCAGACTAC GTCAGTACGC TGGATGCGTC 400 ACAGTACAGCTTAATCGATC AGA 423 173 base pairs nucleic acid single linear NO YES DNA(other) 57 ACAGAATCTT TTTCACGACG TTCTCGTTAA TAACCGATAA GACGTGAGGA 50GTTTAGCAGA TTTAGTGCTT GATTTCGTGG CTTGTTTACA GTCAAAGAAG 100 CCGGAGCAAAAGCCCCGGCA TCGGCAGGAA CNCTTATTTA TTAATAAAAT 150 CTTCCCCAAC TAATATCTTTTTT 173 218 base pairs nucleic acid single linear NO YES DNA (other) 58GATCCTCCGT GGCATAAGAA ATGCCGCCAA GAATCGTGAG TAAGATGTTG 50 AAAGGATTGCGATAACATAC CCACAGATGC ACCCACCACG GCGAGGGTTT 100 CTGTGCCGGA ACGGTTTTCGCCATGCTTTT CACGCGCNNT CACCTCGGCA 150 GCGTTTAATC CTCGGTGCGT ATCAAAACCTGCAGAGAGTC TCTGCTCATG 200 CGCGACTTCA GACAGTAG 218 346 base pairs nucleicacid single linear NO YES DNA (other) 59 GATCGAGAAA AGTGAGCATCCCTTCGATGG TAAGTTCGGT CTCATCCTCC 50 ACACTTAATG TCGGATTGTT CCCGGAACCATCCAGCTTAC GTGTCGCTAT 100 CAGCAATACT CGGAATCCCT GCGCATTGTA ATCTTCGGTTTTCGCCAGCA 150 GTAGCTCGCG GCGTGTTTCC GTCAAGCGCC ACCACACGAT CGCCTTCGCG200 AAGATGGGTG GCTACCATCA TCATCTCTTC AACGGCGCTT TGCAGATCAG 250GCATCTGTCT CATGCTGCGC ATCTCACAGA CGATACCGCG 
ACGTACAAGT 300 CGATGCAGTCATCGTTATGA GCCCTTGCGA TGTGCATGAC TGCAAC 346 323 base pairs nucleic acidsingle linear NO YES DNA (other) 60 GATCCTGACG AATGGCCACA ACGGAAGGCTCATTCAATAC GATGCCTTGT 50 CCTTTTACAT AAATGAGGGT ATTCGCGGTA CCCAGGTCAATGGACAGGTC 100 ATTGGAAAAC ATGCCACGAA ATTTTTTCGA ACATACTAAG GGATTAATTC150 CTTGAAAGCT GGGGCGAAAA CAAAATGCGT TTACTTTACC AACCACACGC 200AGCAGCGACA AGCGCGAAAA TCATCTGCTA CGTGAATTAG TGCGTCGTTC 250 TTTGTACAATCTCGCTGAGT CAGCTGAAAA TCACGCGATC TGCTCGTGAC 300 TTGAAGATCT CGATTCTCGACAT 323 276 base pairs nucleic acid single linear NO YES DNA (other) 61GATCGCGCGT GGTTTGCAGC GTCGGTTCCA CCACCAGTTG GTTAATGCGG 50 TTCGTTTCCAGACCACCAAT CTCTTTCATA AAATCTGGCG CTTTGATACC 100 CGCCGCCCAC ACCATCCAGATCGGCCTGAA TATATTCACC TTCTTTCGTA 150 TGCAGACCGC CTTCGGCGGC GCTGGTGACCATAGTTTGCG TCAGCGCGAA 200 CGCCAGTTTG GTCAGTTCAT TATGCGCGGC GTGGAGATACGCGCGCACGA 250 GGCAGATACG CGCAGTCACA CGAGTC 276 166 base pairs nucleicacid single linear NO YES DNA (other) 62 GGGCCAGAGG TATGACTCCACCAGACCGTC AAAGACGGCG TTGCGTCGTG 50 CTCAGCATAG AAGCCGCGCG CCTGCTCAACGGTCAGGTGC AGCATTATTA 100 GTGCCCAACA ATTTTGAACC CTGCAGCTTC AAACGCGCGAAAGATCGTCC 150 AATACGTTCT CCGACC 166 425 base pairs nucleic acid singlelinear NO YES DNA (other) 63 GATCTTTAGC CGGGCAGACC TCTACGCATA AATTACAGCCAGTACAGTCT 50 TCCGGCGCGA CCTGCAGCAC ATATTTCTGG CCGCGCATAT CGCGGACTTC 100ACGTCCAGCG AATGCAGACT GGCTGGCGCG TTCTCCATCG CCTGCGGGGA 150 AACGACTTTCGCACGAATTG CCGAGTGAGG GCAGGCAGCG ACGCAGTGAT 200 TACATTGTGT ACACAGTTCCTCTTTCCAGA CAGGAATCTC TTCGGCGATA 250 TTGCGTTTTT CCCAGCGGTG GTGCCCATTGGCCATGTTCC GTCGGCGGCA 300 GGGCGGAAAC AGGCAGTGCG TGCCGAGGCC CGCCAACATGGGCCGTAACG 350 TTTCAGAAAT CGCAGTGAGA CGGCGGCATC CCATAGGATT ACGCTGAGAT400 CCAGATCTCC AACATCTCAT CTAAA 425 333 base pairs nucleic acid singlelinear NO YES DNA (other) 64 GATCTACCGG GTGAGCGTAT AACCNATCTT AATCCCTCCCGGTTAGGTTG 50 ACATTAGGAT CCTGTTCCTT TCGGGTTATA CTGCGCTGAA CGCGGGTCCA 100GTCCAACGTG AATACGGCAG ATAAACCAGA CCAGCCAGTA ACACAAAAAT 150 AAAAATTCGCAGCTTCCACA AAGCCAACCC AGCCGCTTTC 
GCGATAGAAG 200 TCGACCATGC GAACAGATACAGCGCTTCAA CGTCGAAGAT AACGAAGAAC 250 ATGGCTACCA GGTAAAATTC GGAGACAGGCGTAAGGCGCG CCGGTGCGAC 300 CATTCATCTC CATCCTTTGA ATTACGGACA GCA 333 374base pairs nucleic acid single linear NO YES DNA (other) 65 TTATCAATACCCGCATTTTT ACTGAAACCG GGCGTGATGT TTTTGGCTTT 50 GACATTGCGA ATGACGAAATGTTTGCCATT TTCTACGTGC ACAAGCTGTC 100 GGCAATCAGA TCCGGTAATA TTGGCCACCACAAAGTTTTT TACTGCCTGG 150 TCTTCAGGAT AACTGTTGTC ATAGGTGCTA CCCGCCAGCCCGATCCCCCA 200 GTTGATTTTG CCATTGGTAC AATTAATGCG TTCGATGACA TGATCGGAAA250 TCAGGATGTC GCGGTCGTGA TCGCGACATT CCACTCATGG CGTCCCCTGT 300AATCGCTAAG CGCTATCGTA ATCGCGCGCA TCCATTGTTA TGAATCCTGC 350 GAGATGGCGAGTGCGTGGTA CGGA 374 296 base pairs nucleic acid single linear NO YES DNA(other) 66 GATCCTGAAA TGCCCATCCA CGCCAGCTTG GGTATAGAGC AATCTGGCAG 50TATAAGATTT GGGATGTATT TTGGCCGCAG CCGCAAAAAA CGCGTCTGGG 100 CGATTCGGACAACCAGAAAG AGGCGCTCTG TAATGCGGTC TGGGCTATGG 150 GACGAATTTC CAGATAATAGTAAACGATTA ACCCTACACG AAAGCGTAAC 200 AGAAGCGCAT AACGCCTTTA AAAACCACAGTAACACGCCT GCATTATAGT 250 TTTTCTTACT CAACATCTAT CGTTCGCATA CCGGATGTAATAGGCT 296 178 base pairs nucleic acid single linear NO YES DNA (other)67 GATCGGCAAA GGTACCGGTG GTGCCGTCGT AGTTTTCTCC GCGCCGGGCG 50 TTAACGTTCTGGCCCAGCAG GTTGACCTCA CGCGCGCCCT GGCCGCTAAC 100 TGGGCGATTT CGAACCGGATCATCGTCTCA GGGCCGGCTG ACTTCTTCGC 150 CGCGGGTATA CGGCGCACAC GTAAGTAC 178327 base pairs nucleic acid single linear NO YES DNA (other) 68GATCAAAAGT TTTCTGCGCC GCCTCGTTCA TCAGTTTATA AGGATTGCTC 50 TGATCCGCTGCCGTTGCTGC GCTTAATGGC GCAATGACCA GCAGGGCCAC 100 CATCATCAGT CGTTTAAACATGCCTCAATT CTCCTGAGAT TATTTCGTTT 150 CGCCCGCGGG CTTGTGGCTT CAGTATGACCTTCCGTTGCG GGCTGGCGCA 200 TCGCAGAATT CTTATTGTCG TCGCCTTCGT GTTATAAGGAACTGCCAATC 250 ATATCTCCAG CACATGCAGA CGGTCTGATC GTACTGCACG CTAGATAGAC300 GTCAGACTCA ACACAACGAG CTAGCGA 327 375 base pairs nucleic acid singlelinear NO YES DNA (other) 69 GATCCAGCAG GTTGATTTTT GTTTCTTTGT TAGGAACTACCGGGGTACTG 50 CTTTCAGGTG TGACAATTTG TTCAGACATA TGCTATTCCG GCCACGTTAT 100TACACGTTAT GGCCCCTGGA 
GGTTGAAAAA AGAAACGCCC CGGTAAGCTT 150 ACTGCTCGTCCGGGGGCGCT GCATTGTACA AATTCTGGCG TAAGGAGTCC 200 ACGTCTGCAC GCGCATTAGCAAAAATAATA TTTGAACCGA TAATTTATCG 250 CCAACGCATT TACAGCGTGA AAGACGAAGGAGATTAACGG GTGGGGGCCA 300 CTCGCTTCAC GAGAAAAGCG ATTCGGCTGG CGATTCAGCGAATCGACGTG 350 TGCGTTCAGT ACTATCACGT AGTCG 375 298 base pairs nucleicacid single linear NO YES DNA (other) 70 GATCGGACGG CGCCTTATCTTCTTCAATAT CGCGCGTACC GTAGAAACCT 50 TCAGGCAAGG TCGCTCAGCG ACAGCCTGCTGGCTGAGTCC GAGTTGTTCA 100 CGGGCATTGC GCAGACGAAC GCCGGTGGTT TGTGCTTCATTTTGGTCGTG 150 CGTTGCTTCA GTATTCATTC GCTACAGCTA ACGGTACGTG TAAATTAGGA200 TTCAGGCGCC GACGAGCGTA ATGCCGCCAC GCGCAAACAT CGTAGTACTT 250AGTCAGACAG TATACGTTAG CGCGCGATAC AGCTAGAACG CTAACTGT 298 234 base pairsnucleic acid single linear NO YES DNA (other) 71 GATCTCACCT TTTTTTAGCTGCGGCATCGC TTCCAGAGTG GCGACCGCCG 50 GGTACGGGCA AGGTTCGCCA ACCATATCCAGACGGTAATC AGGGACGATA 100 TTTTTCATAC AGATTCCTTA GCAGGCGTCA GCCCGCACGGCGAAAAAACG 150 TTTTTTTCCC AGCCGATGAT TAACATTCAG TGGTAAATAA CAACAAAGTA200 GGTGACACGC AGACCGTAGG ACCAAGTATT CAGC 234 317 base pairs nucleicacid single linear NO YES DNA (other) 72 AGCTCTGATT TCGGTAGCGATACGTCATCC ATCAGATTCG CCAGCGGATG 50 GACAAACGGC AGGATGACCA GGCTGCCGATCAATTTGAAC AATAGGCTGC 100 CGAGCGCTAC CGGACGCGCG GCAGCATTGG CGGCGCTGTTATTGAGCATC 150 GCCAGCAGCC CCGATCCCCA GATTGGCGCC GATGACCAGG CACAACGCCA200 CCGGGAACGA TATAATCCCG CCGCCGTCAG GTCGCCGTCA GCAACACCGC 250CGCCACTGGG AATAACTGAT AATAGCGAAC ATCCGGCCAA TAGCGCATCA 300 GCATATGTGCCTGAGAG 317 134 base pairs nucleic acid single linear NO YES DNA (other)73 GATCGAGGGC ACAGGAGAAA CGGGCATTTT CGCCGCAATT AGTTGACCTG 50 ATCTCCCAAGACCAAATTTT CCTCAGCCGG AATATACCAG AACTGGTCGC 100 GATATCCGCA AGATCGCGCTTCACGGCGTC GCTT 134 387 base pairs nucleic acid single linear NO YES DNA(other) 74 GATCGTAATG TGCGGCCAGT TCAAAACCGA AGCGGCTATA TAACGCCGGA 50TCGCCCAGCG TCACGACCGC CGCGTAGCGA ACTCGTTGAG CGAATCCAGC 100 CCTTCATACACTAACTGGCG CGCCAGCCCT TGCCCGCGAT ACTTTTCATC 150 GACCGCCAGC GCCATGCCGACCCACTGTAA ATCTTCGCCT GCACATCAAC 200 
CGGGCTAAAG GCGACATAGC CACACTGACCTTCATCATCG TGCACAGTCG 250 AGGTAGAAAA CATCTCACGA AATCGTGAAC AGCTTGCTTCGCATGTTTCG 300 ATGACGGCGT ACACGCGATC AATACAGCGC ATCATAGATT TATGATAGAT350 GTATAGAGTG TGTCTAGAGT TTATCGCTAC ATCGAGT 387 189 base pairs nucleicacid single linear NO YES DNA (other) 75 GATCGTAAGG ATTGACGATTAACGCCGACG TCAGTTCATT CGCCGCTCCG 50 CAAACTGTGA CAGTACCAGT ACTCCAGGGTTAGCGGGGTC CTGCGCGGCG 100 ACAAACTGTT TGTGGACCAG GTTCATCCCG TCACTCAACGGGTTACTAGC 150 CCGACGTCTG AATAACGGAA TATACTTCAT TAACAGTTT 189 217 basepairs nucleic acid single linear NO YES DNA (other) 76 GATCACGAATATTCATTATT CATCCTCCGT CGCCACGATA GTTCATGGCG 50 ATAGGTAGCA TAGCAATGAACTGATTATCC CTATCAACCT TTCTGATTAA 100 TAATACATCA CAGAAGCGGA GCGGTTTCTCGTTTAACCCT TGAAGACACC 150 GCCCGTTCAG AGGGTATCTC TCGAACCCGA AATACTAAGCCAACCGTGAC 200 TTTGCGACTT GGTTTTT 217 275 base pairs nucleic acid singlelinear NO YES DNA (other) 77 GATCCCTTCT TTTGCTGATG CAGTAGCGGA CCAGGCTACCACAAGGGGAA 50 TGATGCAGAC TGCGAAAAAG TTTTTCATTT CAGAACCTGC CTTAATATTG 100GGCTAAAAGA CAAGTTTCAC GGTATAGGGT ATGATATAAC GATTCAATAA 150 ACGAAGCCCAAAAAACGGTC TATTGTAACG CTGGGTTTCT GTAAGCGGGT 200 AAAATGAGAT GAGATTTAATAACATCAGAT ATCTCGGATG AATCACTCTC 250 GAATCCGCAG CGTCCATCTA CGTAT 275 101base pairs nucleic acid single linear NO YES DNA (other) 78 GATCTTCATACAGGCCCAGA TAGCCGTCAT AAATGCCCAT GACTTCCAGC 50 CCTTACGTCA ACGCTGCAACACAACACCGC GGATTTTTGA TTCATTCTCT 100 T 101 303 base pairs nucleic acidsingle linear NO YES DNA (other) 79 GATCCGCACG GATAAAAACT CGTTTCCCGGCCAGATCCAG ATCGGTCATC 50 TTAATTACAG ACATGGTGAA TCCTCTCAAT GATGCTTAAAGTTTTGTCGA 100 CGCTGACGCG TGAGCCTGAA ACCAACTGCG GCCATCGCTA ACGTGGTGTC150 GAGCATCCTG TTAGCAAAGC CCCATTCATT ATCGCACCAG ACCTAGCGTC 200TTGATCAGTG GGCGCACTGA CCGGGTTGGG CATCACATGG CGTGGCTGGT 250 AATTTGGACGGTGCATGTAC TCATGATGGC TTGGTTGGCC GGATTGCTTG 300 CTT 303 257 base pairsnucleic acid single linear NO YES DNA (other) 80 GATCGTGACC CGGATAACGCTCATCATCTT TGGTCAGTTC CGGCGGCGTC 50 ACGGCAAAAC CGCGGCGCCA CTGTTTAACCTGCTCGTCAC CATATTTTTC 100 TGCCGTTTGC 
GCTTTATTCA GCCCCTGCAA CGGCCATAGTGACGTTCATT 150 GAGTTTCCAG GATTTTTTCA CCGGCAGCCA CGCTGATCCA GTTCATCCAG200 TACGTTCACA GGCTATGGAT AGCGCGTTTC AAGTACGGAA GGTAGGCAAA 250 TCAAGCG257 290 base pairs nucleic acid single linear NO YES DNA (other) 81GATCGAGCAG GCATTGCAGC AGCAGACTTT TGCCCTCCCC GCTGCCGCCA 50 ACCAATGCCACCATTTCGCC GGGCGCGATA TCAAAAGAGA CATTCTGTAA 100 TAACGGCGAC CAGCGTCTCGCGCCATACCA GCGATAACGG CGCTTTCCAG 150 CGTAACCTGT TGTAAACTCA GATACGTCACTCCTTAGCAC AGCCGCTGAA 200 TGGCGGAAAC TGTCGAAGAG CATCACAGCG TGAATAACATTAGGCCGGGA 250 ATAGACAGCA CAGTTCATGG CTAATAACGT ACCGTCGAGA 290 233 basepairs nucleic acid single linear NO YES DNA (other) 82 TGCAGATCCACCTGGAACGG CGGGATGTTG ATCACCTGGG AGGCCAGACC 50 GCTATTACGG CGCATTAACGCGCCATTACC TCTTCGATGT GGAATGGCTT 100 CGTCACGTAG TCATCGGCCC GGAGCTGAGAACCTCGACTT TATCCTGCCA 150 GCCTTCGCGC GCGTTAACAC CAGAACCGGC AGTGAAACATCACTCGTGCG 200 CCCACGGGTA TTAAGGAAAG GCCGTCTTCA TCC 233 284 base pairsnucleic acid single linear NO YES DNA (other) 83 GATCTCATCA AAACGGTTGAGTACCAGCGC CAGGGTCATA CCCGCCTGGT 50 TCAACGCCGT CAGGTGCGCC AGTTGTTGACGGGCGGTCAC GTCAAGCCCG 100 TCGAACGGTT CATCAAGGAT CAATAACTCT GGCTCAGACATCAGCACCTG 150 ACACAGCAGC GCTTTTCGCG TCTCGCCGGT AGAAAGGTAT TTAAAACGCC200 TGTCGAGTAA AGCGGAAATC CGCGAACTGC TGCGCCAGTA TCGCACAGCG 250CAGGATGGTG ACATATCCTG AATATTCGCG TAGT 284 367 base pairs nucleic acidsingle linear NO YES DNA (other) 84 GTTGCGATTA TCCCGCAGCG CCTGCTCGAACAATTGGATT TGCTCAGTGC 50 TTTCATGCCA TAACCAGAAG GTACTGATTA ACTGGAACACCAGCAGAATA 100 AGACCAATTG TCAGCATTAA ACGCTGGCGA AGGGTCACTG CTCTTCGCTG150 AAAACGCATC AGGCTCACTT AGCTTTCCTC AGTGGCAACC AGCATGTAGC 200CAAACCCGCG AACCGTGCGA ATGCGACTTG CCGACTTTGT CGCGCAAATT 250 ATGTATAGCACTTCCAGAGT GTTGGTCGAG GGTTCGTTAT CCCAGTTGTG 300 ATATCGTTAT AAAGAATTTCCGGTGCACGA CTGCCTGAGA CTAACCGTGA 350 GAGCACGTAT CTAGCTC 367 320 basepairs nucleic acid single linear NO YES DNA (other) 85 GATCGTTGATCGCCTGGATA ACAACCTGCT GCTGCTCGTG ACCGAATACC 50 ACCGCGCCCA GCATAGTGTCTTCGCTCAGC AGTTCAGCTT CGGATTCCAC 100 CATCAGCACA GCCGCTTCGG 
TACCGGCAACCACCAGGGTC CAGCTTGCTT 150 CTTTCAGCTC GTCTGGGTCG GGTTCAGCAC GTACTGGTCATTGATGTAAC 200 CTACGGCGCG CGATTGGGCC GTTGAACGGA ATGCGGACAG CGACAGCACG250 ATGCGATCAT CGCACGATGA TCAGGTACTG CGTACGAACG ACGTCCGATA 300ACTCGATGTA CAGCTCGGAA 320 249 base pairs nucleic acid single linear NOYES DNA (other) 86 GATCAATAAA TACTTTACGA ACTTCACTGG AGATTTCCCATTTAGTGTCA 50 TTTGGGCAGT TTATAAACAA ACGCGCGGTA GTATAAAGGC AAGCCAGACG 100CATTGATATA CCCGTTAACG CCGACGGGTG ATAAGGAGAT CGACCGTTAT 150 GGCTTTTAAACCTGGCAAAT AGGATTGCAT TATTCCAGCC ATGAAGCGCT 200 GGCCATCGCG TTATTCACGCGCATCGGCTG ACACGCACTG TGCACTGCG 249 275 base pairs nucleic acid singlelinear NO YES DNA (other) 87 GATCGCCTTT TGCTGCCAAC GCTGCGGGAG AAAGAGCAGAAAGAGCGAAA 50 ACAGCTGCGA CAGCCGCCAG AGTCGATTTG AGCATGAGAT TTCCTTAAAG 100AGAGCAGAAA TAAAGCAAGT GGAATGATTT TAAAGAGCCT TCTGGGCCAG 150 GCAGCCTTTACTATTTACGT ATATGAACAA TGTACGTTAC GACGACGCGT 200 ATCTGCATAT GATGTGACAACATAATAATA AATGCATGAC ATACTATACT 250 ATATATTAGC TACAAGCTAT GCTCA 275 325base pairs nucleic acid single linear NO YES DNA (other) 88 GATCGCCGCGAACCAGCAGA GCCACCAGCG GAGACTTGCT GTCTTTCACC 50 GCTTTCACCA GCAGCGTTTTTACCGTTTTT TCAATTGGCA GGTTGAATTG 100 TTCCACCAGC TCCGCGATGG TTTTGGCATTTGGCGTATCG ACCAGAGTCA 150 TTTCCTGCGT CGCGCTGCGC GGCTTTGCGG GATAGCTTCTGCAGTTCAAT 200 GTTAGCCGCG TAATCAGAAA CATCAGAGAA AACGATATCG TCTTGCGCTT250 TGGCAGCCTG GAATTCATGC TGGTTGGCGA TAGACGTATG CTGTACGGGA 300ATCAGCCATA GTGAGATACG CTATA 325 230 base pairs nucleic acid singlelinear NO YES DNA (other) 89 GATCGATACG ACGTTCAAAG GATTCAAACC GCGCCATGGCTTCATCCAGT 50 TTGCCGCTGT CAAGCTGACG ACGGACATCG CGGGAAGAAC TCGCCGCCTG 100ATGACGCAGC ATCAGCGCCT GCGGGCGAGC GCGCGTGTTT CGCTGAGTTT 150 GTTTTCCAGCGTCGCCAATC TCTTTCTTCA TGCGCGCAGT GTCATCACAG 200 CGTGACTTCT GTTCAGCTAGCATAATCGTC 230 146 base pairs nucleic acid single linear NO YES DNA(other) 90 GATCCCATCG CTTTTTCAGA TATCATGCAC TTTTTGCACT CAATCTGCGG 50CAAATCCGAC CACTTTTTGC TCAGCCAGAA TGCAGTATTT CCGTCATACA 100 TCGATTAGCTACGACTCTAC GAACTACCTC GACCACAAGA TCACCG 146 184 base pairs 
nucleic acidsingle linear NO YES DNA (other) 91 GATCTTTGTT AATAACAGTG AGAGAACCGTACGAATGTAG AAGAACTCCC 50 GCCAGGCGGC AACATCTTTC ATAGTAGACC AAGCGTTAACCCCTGCTGAT 100 GTAAAAACGC TTCTATCTCT TGCGCACCAC GGAACGGAAG GTTGCGCGCC150 TTTAGCGCTT ACGGCAATAG CCGCGGCGGA TGGG 184 311 base pairs nucleicacid single linear NO YES DNA (other) 92 GATCAAACAC ATGAATACCGAGGCCTTTGA GTTTTTCAGT CGAGGCGTCC 50 GAGCTGGAGA CCGCGCCTTC AATCTGGCCTTTCATTGTGC CCAGCGCATC 100 AATAAAGTCT GCGGCCGTTG AGCCTGTACC AACGCCCACAATGGTGCCGG 150 GCTGTACTAT CTGAAGTGCC GCCCATCCTA CCGCTTTTTT CAGTTCATCT200 GCGTCATAGA TCGTTAGAAT GTGTGTGAAA TACGCCGCAT TATAGAACAT 250GTCCGGGAAA ATCTCGGTCG TACACAGCTA CGATTCGATT GCGCGCAATT 300 TTGAGGGAAA A311 448 base pairs nucleic acid single linear NO YES DNA (other) 93GATCCTCGAT TAGGGGAGGC GCTAATTGAA TGTGGCGAGG TGTAAGAAAG 50 CAGAAAAGCAAAGTGGGTTC TCGTTGCTCT GCATGTCGTC AAATTCAATT 100 AAACGCATAA AAAAACCCCGCCGGGCGTTT TTCTTCAACT TCCAGGCGAT 150 TACGGCGAAC GAAGTCGATG TGAGTCAGCTTCGGTTTGTA AGCGTGACCG 200 TGTACAGCCT GAGCTTTAAC TTTTACTTCT TTACCGTCAACAACGAGGGT 250 CAGAACTTCG TGTAGAATTC AGCTTTAGCT TGCATGTTCA TCACCTGGTC300 GTGGTCAGTT CGATAGCAAT CGGGCTTCAG AACCGCGTAG ATGATTGCCG 350GACTGTAGCG CGCAGGCGGC AGCTCCTACA TGCTCTTACG TACTCTGCGT 400 GATAGTAACATTAATCTCTT ATATCTGCAG ACTGCACGAG ACTCGTCG 448 359 base pairs nucleicacid single linear NO YES DNA (other) 94 GATCATATCG ACGGTATCGGCGTAATTATT TTGCAGATGG CGTAACACAT 50 CCAGATTATC TCCGGTCAGA AAAAGATTATGGCTGTTTTT ATTTTCTGCC 100 AGAGTATTGT GTTCCACGTC AGGAACGATA ACGGTAACGGATTTTTCACC 150 CGCCTGTTTT TTTGCCGTAA TCTTTGCCAA TAAAATCAAT CTGATAACCG200 CTAGTCAGCT CAATATTACG CGCTTTCAGG CGCTCAAATC TGGCGAGATC 250AATCCGCCTT TCGCGATCAG TTCGCCCTCT CGTTATAGCG GATCGCGGTA 300 AAAATTCCGCGGTAATCGCA GTTGTAACTC AGACAGAAGC GCGTATTCGG 350 CGCAGACGC 359 298 basepairs nucleic acid single linear NO YES DNA (other) 95 GATCCAGTTTAACCTCTGGC TGCCAAATCT TTCTGGAAAA CATGCGGTGC 50 GTTTGGCGCT TCGAAAGAAACATCCTGGTA TAGATACGTT GGATCTGGAA 100 AGCCATTTCA GTGTTATTTT TGTTCTGACATGTGTAAAAC CCTTTAGTGT 150 
TGTTCCTTAA ATACTTGAGT AACGCCTTAA CGCAACAGCGGATCCAGTCC 200 ACCACGCGCA TCCAGCGATA CAAGTCGTCA CAAGCGCAAT GTGCTGTGCC250 TCAATCAAAT TTGCGACGTC GTCGCACTAC GTTGATATCT TTACGTCA 298 217 basepairs nucleic acid single linear NO YES DNA (other) 96 GATCGTAAGAGTCAGAAATA AGCAGGCGTA ATGTTGTCAT AGTGGTTTTC 50 CTTACCTTTA TTAAGCCGTCATTTTACTCT TTTTCCTCAC GCTCTTCCTC 100 TTCCGGAACA GGCTTGCTGG CCGTTAGCAGGAAGGGCGAC TGCTGCCAGC 150 GGGTGCGTTT ACCTTGTAGC AAGGTGNNNC AGACACCACGCCTATCGCAG 200 CGAGAGTAGC AGCATCA 217 335 base pairs nucleic acid singlelinear NO YES DNA (other) 97 GATCGAACTC TTTAAGCAGC ATCTTGGTAT GGAAAATATTTTCCTGATAC 50 ACGTTTACAT CCACCATGTC ATACAGCGAC TTCATATCTT CCGACATAAA 100ATTCTGAATA GAATTAATCT CATGATCGAT AAAGTGCTTC ATACCGTTGA 150 CGTCGCGTGTAAAGCCGCGC ACGCGTAATC GATGGTGACG ATATCGGACT 200 CTAGCTGGTG GATCAGGTAATTGAGCGCTT TTAGCGTGAA ATCACCCCGC 250 AGGTTGACAC TTCGATCGTC GGCGGAAAGGTGCATAGCCC GCCTTCCGAT 300 CGCTTCGATA GGTATCGACG CAGATATGCT CTATG 335 352base pairs nucleic acid single linear NO YES DNA (other) 98 GATCGTCGTAGCTGCCGGCA TTGTGGTTGG GTAAATACTG GCGGCAAAAC 50 GAGACTACGC CAGCGTCTATCTCTACCATG GTGATGGTTT CGACGTTTTT 100 ATGCCGGGTA ACTTCACGTA GCATTGCGCCGTCGCGCCGC CGATAATCAG 150 AACGCTGTTT CGCATGACCG TCCGCCACAG CGGGGACATGGGTCATCATT 200 TCATGATAAA TAAACTCGAC GCGTTCGGTC GGTCTGTACC AGCCGTCCAG250 CGCCATCACG CGGCCAAAAG CGGCTTTTCA AAGATGATTA AATCCTGGTG 300ATCGTTTTCA TGATACAGAA CTTGTCTACG GCAAGTCATG ACCAAACTGG 350 TC 352 127base pairs nucleic acid single linear NO YES DNA (other) 99 GATCTGTTTCGGGAAGTGAA CTTAAGGCCT CCGCAATATC ATTTATATAA 50 ACTGACATGG CATTTTTAAACTGCTCAGTA CTGCGTTTAC ATTTGTGGAA 100 GATAGTCTCT GAGAGCAGAG TTTCTTT 127345 base pairs nucleic acid single linear NO YES DNA (other) 100GATCGGCAAC CTGCATTGCC AGTTCGCGGG TTGGCGTCAG GATCAGAATG 50 CGCGGCGGCCCCGATTTTTT ACGCGGAAAG TCGAGCAGGT GCTGCAACGC 100 CGGCAGCAGA TATGCCGCCGTTTTACCGGT GCCTGTCGGC GCAGAACCGA 150 GTACATCACG GCCATCGAGC GCAGGCGTAATGGCGGCGCT GAATGGCGTC 200 GGGCGAGTGA AACCTTTATC CTGGAGGGCA TCCAGACAGGCTTTCGTCAG 250 ATTCAAGTTC 
GGAAAAAGTG TTACAGTCAT GTCTACCTCT GTGTGGGCGC300 TGATTATAGA CTTACGCGCA TCTCATCTGT GATGATATCT CTCAG 345 250 base pairsnucleic acid single linear NO YES DNA (other) 101 GATCCGGGAC ATTCACGTTGAGAATACGCC CGGTACGCAA CGGCTCCCGG 50 CTTAACCCTC GCAAAAGCGC ACAAGTCACGGCCGCAGCGA TACATAATGC 100 TGATAGCCGT TAAGGGAGAC CGCTAATGCC GGAAAGCCGAGATGACGACC 150 TTCATCGCGC GCACAGTACC GGAATAGATC AACATCATCG CCAGATTCGG200 ACCGCGTTAT ACCGGAAACG ACATATCGGT GACGATTAGC TTACGCAGAT 250 333 basepairs nucleic acid single linear NO YES DNA (other) 102 GATCCCGGCTTACGACGGTT GGCTGGATGA CGGTAAATAC TCATGGACTA 50 AGCTGCCGAC ATTCTACGGCAAAACCGTCG AAGTCGGGCC GCTGGCGAAC 100 ATGCTGTGTA AACTGGCTGC AGGTCGTGAATCCACGCAGA CCAAGCTCAA 150 TGAAATCATT GCGCTTTATC AGAAGCTGAC CGGCAAAACGTCTTGGAAAT 200 TGGCGCAACT TCACTCTACG TGGGTCGATA CATCGGGCGT ACCGTTCACT250 GTTGTGAACT GCAAAACATA TTGCAGGATC ATACAGCTGA TTGTAATATC 300GGCAAGGATT ACACCAGTTT GAGACGGCAA TCG 333 284 base pairs nucleic acidsingle linear NO YES DNA (other) 103 GATCCAGCCA GACGGAACCC CACGGCGGCGGAGACGGCAG AGCGTAAGGG 50 CCGATAAACA GACGCTGCCA GGCCTGTGCA ACGACTCTTCGCTGTGGGTC 100 TTAAACATAG CCGCCACAGG GCAAGGCTCG GCATCAAGCG GCCACTGCGC150 CTGCAGTCGT CGTTTAATAG TCGTCCTGGA CCAGAGGAGC GGTTTCGTGG 200CTTTCCGCGA ATAATAAAAC AAGTGCCAAG AACAGTGTTA CTGCAAATCA 250 TCTCGTTGTAAAAAGTGTAT TAAACATCCG TAAA 284 249 base pairs nucleic acid single linearNO YES DNA (other) 104 GATCAACGCA AACAATCAGA ACCTCTGCTT CATTTAGCAGCGTGTTCTCT 50 GCGTTGACAA TGCGTTGCGT GAAAACCAAA GCGGTGCCAC GCATTGACGT 100AATTTCTGTT TGAGCTTCAA GCATATCGTC GAGCCGCGCA GGCCATAGTA 150 TTCCAGCTTCATCTTGCGCA CCACAAAGGC TACCCGCTCC GCAGCAGCAC 200 CTGTTGCTGA AGTGATGGTGGACGTCAGCA TCTCGNNNTC TTCATAAAA 249 248 base pairs nucleic acid singlelinear NO YES DNA (other) 105 GATCCCTTTA CGACCAGGCG TCCCGGCGCCGTTATAGTGC CAGCCAAAAC 50 CAAAGCCGCC GCCCGGTAAA CCAATCTGTT CCAGCATTGCGGCCAGCACG 100 ACGACCATCC ATGACCACTG TTCGCATGCT GCATACGTTG TACGACCAGC150 CAGCGATGAT TTCGGTTCTG TCGTCGCATC TGTGGCAACG CGACTGGGTG 200GTGTAATCAA GATCATTTCG CAGGACTTGG TGCATTGTAG 
AATCGAGA 248 175 base pairsnucleic acid single linear NO YES DNA (other) 106 GGCGGAGGAT TGCCACGTNGCAGCCTGCTA CGCCCGTCAG TTCTTTACGC 50 AGGTTAGCCA CCAGTTCGTT TACCATGTGGCGGCTCCNTG TCAGTTTCCA 100 GTTACCCATC ACTAAAGGAT GTGATTTATT TNTCCACGTTAGTAGCGAAT 150 TAAGGAAGAT GGCCGCTCGT AGAGA 175 307 base pairs nucleicacid single linear NO YES DNA (other) 107 GATCATTATC TTAACCTAAAACCGCTATAT TTATAAGTAT TATTACGAAT 50 AATCTTAACC TGGGATATGT TATACTAATCGGACCAGAAA GATATTATTA 100 CGACTTTAGT AAATGCTTTT TAAATATTAA ATAATAATTAATTAAGATTT 150 CTACCATTCA TTAATTATAC TTAACAATAG TTTCACACCC CGCGCCGGAA200 AGGTCTAACC TTCTCATTTA CCTTTAATAC TCAGTATTCC CGAATAGCCG 250ACCGACACTA ATGATGAATG CTTATCTCTC ATAAACCAGA TATTATGACA 300 CATAACC 307234 base pairs nucleic acid single linear NO YES DNA (other) 108GATCAGGATA TGCCGCCGCC AGTAGCGATA GGGCGTCAAC CTCGTGCTTA 50 TCGGTGATGAGCGGCGCGTT GGCCGGGGCT TTTAAAAACG AAAGCATTAT 100 CCTTCCTTAA ACGTAACGCTGGGGCAACGA GACGCTCACC CGCGTACCGT 150 GGGTACAAGA GATGGTTAGC GTCCGCCGAGCGACGACACG CGCTTCGCAT 200 TCGGTCAGGC CGAAGCCTCT TGGTGAGACC GCCG 234 352base pairs nucleic acid single linear NO YES DNA (other) 109 GATCGAGCGCGGAGAACGGT TCATCCAGCA GCAGTACCGG CTGTTCGCGT 50 ACCAGGCAGC GCGCCAGCTACCCGCTGACG CTGGCCGCCG GACAGTTCGC 100 CCGGTAAACG CGTCATCAGA CTCTCAATGCCCATCTGATG TGCGATAGCT 150 CCCGTTTTTC CCGCTGGCTG GCGTTGAGCG TTAACCCAGGGTTTAGCCCC 200 AGACCGATAT TTTGCCTGCA CATTCAGGTG GCTGAATAAA TTATTCTCCT250 GAAACAGCAT TGAGACCGGA CGGCGTGAGG GCGGCGTAAG CTATGATCGT 300CGGCAATAGT AGCGTACGCT GGCCAGGCGC AAGAAACCGC ATAATCTCTC 350 TT 352 168base pairs nucleic acid single linear NO YES DNA (other) 110 GATCAGGGTCAGACGCTTGT GCGCCCATAC AACGTTTTGT TCCAGTTGGC 50 CTTTCTCGTT AACGTTTTGGGAGCGCCAGA GCTGTTTAAC GCTCATGGGG 100 CATTCCAGAA CGGGCAGTAT CTCTTCAAAGGACGTTATCG TTTGTCAACG 150 GCGGACAGCA TTTTCAAA 168 211 base pairs nucleicacid single linear NO YES DNA (other) 111 GATCTTCGGG GCGCACCCACGGGGTTTTTG CGCGGGGGAC GCCTGTGTTA 50 TCAGCATTGT AGAAACTGCG ATAGATATTTCCGGTGAGGC AATTTTCGCT 100 CGGCACGATG TGTCGCTTAT CCGGTATGTG 
GTGAGCAGTGTGCGCCGGGG 150 CGTGTGATAG AGCCATTGCG CGATGGATCG TCTAGTGAGT TTCTCAGATA200 GGGGGTGACG A 211 257 base pairs nucleic acid single linear NO YESDNA (other) 112 GATCCGCAGA TCCATCTAAT CGGATTAGGC GCATACTGGT AAAGATTCAG50 CCCCCCCGCC AGCCCAATCG GATCCTGACT GACGAACCGT CCACACTCCG 100 GTGCATAATATCTGAACAGA TTGTAATGCA GCCTGTCTCG TCGTCAAAAT 150 ACTGCCCCGG CAGCCGCAGACCGGCTGGTG AAGTACGCCC GCTGTTGCTG 200 ATGTCCGCCG CATTTCTCCA ACCCTGATATACCGCCACAC AGCGTCGTCG 250 CGCGTAC 257 359 base pairs nucleic acid singlelinear NO YES DNA (other) 113 GATCCTGACT GGTACGACTT AACGTTTTAGGCTCGCCAAA ACTCAGCCCC 50 GCCGCTTTCA TCGCTTCCGC GCCTTTGCCC GCTTTCAGCTCGACCAGCAG 100 TTTTTCCGCA TCCAGCTTCG CCTGTTGTTC CGCTTTATTA TGCTTCACCA150 GGGCAGTGAC CTGTTCTTTC ACTTCTGCCA ACGGCTTCAC GGCTTCAGGT 200TTATGTTCGC TCACGCGTAC GACAAAAGCC CGGTCAACCA TCCACGGTGA 250 TAATGTCTGAATTCGGCCCG GCGTACCGTT TGCACAGACG CATAAGATAG 300 CATCGGCTAA CGTTGAAGTCAGCCTTCGGT AAGGTGTACG GCTAACAGCG 350 GTTACGCTT 359 427 base pairsnucleic acid single linear NO YES DNA (other) 114 GATCGCGTAC CGCCAGTAACGCCGCCGCTT TACCGTCAAT CGCCAGCAGG 50 ACCGGAGTCG AGCCTTGCGA GGCCTGCGCGGTGATTTCCG CCGTCATGTC 100 ATCCGTGGCG ACGTGCTGTT CGTTCAGCAA CGCCTGGTTCCCCAGAAGCA 150 GTTGATGACC TTCCGCTTCA CCGCTGACGC CCAGTCCGCG CAGCTTCTGA200 AACCGTTCAC CTGCGGCAGT TTATCATCGC CGGCTTTTTC CAGAGAATCG 250CATGGGCCAG CGGGTGGCTG GAGCTTGTTC GAGCGCGGCA GCCAGACGTA 300 ATGCCTGAGCTTCTCAACGC GTTAAAGGTT TTATCGCACA CTTGCGGCTT 350 GCTCGTCAGC GTCCGGTTTATCAAACTGAG GTATCAACGT ACTGGCGCGT 400 GCAGGATGGC ATGTACAGAG CGATGAG 427299 base pairs nucleic acid single linear NO YES DNA (other) 115GATCTGGAGG TAGAGGTTAT CGAGGCCAGC GGTAAAACCT CACGTTTCAC 50 CGTGCCTTATTCTTCCGAGC CGGATTCGGT TCGCCCCGGT AACTGGCACT 100 ATTCGCTGGC CTTCGGCAGGGTTCGTCAGT ACTACGATAT TGAAAATCGT 150 TTCTTTGAGG GAACGTTCCA GCACGGCGTTAATAACACCA TTACCCTCAA 200 CCTCGGTTCA CGAATTGCGC ACGGTTACCA GGCATGGCTGGCGGGCGGCG 250 TCTGGGCCAC CGGTATGGGC GCGTTCGGCC TTAACGTCAC CTGGTCGAA 299339 base pairs nucleic acid single linear NO YES DNA (other) 116GATCAGAGTA 
AAACCTGGCT GCTATGGTGC GAACGTGGCG TAATGAGTCG 50 CCTGCAGGCCTCTATCTGCG CGACGAGGGG TTTGCCAATG TGAAGGTGTA 100 TCGTCCGTAA TTCCTTTGCCGGGTGGCGGC TATGTCCTAC CCGGCCTATC 150 GTTTTATTTC TGCCCCAACC GTTTTGCAATGCGCTCCAGC TTCATCATCA 200 GCAGCAGCGT AATGGCCACC AGCACAATGG TCAGCGCGGCGTCAGCATAT 250 TTCACGTCGG TCAAGCTAAA GATAGCCACC GGCAGCGTCG TCAGCCGGCG300 ATAATCATCA TCGTGGCCAA CTCCCATGAG AGCATAACT 339 378 base pairsnucleic acid single linear NO YES DNA (other) 117 GATCGATATC AGGGAGGAAGTGGTTGCCCG CCACCAGCGT ATCGGTACTG 50 ATCGCCAGGG TCTGCTTTTC AGGAATATCAGGAGCGCGCA ATCGTCGCCA 100 ATACCGGTTT CAACATCAAG ACGAGAGCTT CTTACACGGTCAAAATAACG 150 GGCAATCAGG GAAAACTCGC CACATGCCAT ACGTTATGCC TCAGCAGAAA200 AAAAGAAAAG GCCGGAGACG CGGGTATCGA GCGCCCGCTA TCTTTCCGGC 250CTGTGAATCA CTTTTTGTTG GGACGAATCA CCGGAGCTGC TTTATCAGTA 300 CGCGTTGACGATTTGTGGCT GTCTTCACGC GCCAAAGTTT GAGTTCATCG 350 CTTCGTTGAT GGCCATTATAAGCCAATC 378 266 base pairs nucleic acid single linear NO YES DNA(other) 118 GATCTCTTAC GATAAAGAGC ACATTATCAA CCTTGGCGCG CCAGATTGGT 50ACGGAAGATT TTGCCCGTGC GATGCCTGAA TACTGTGGCG TGATTTCAAA 100 AAGTCCGACGGTGAAAGCCA TTAAAGCGAA AATTGAAGCC GAAGAAGAAA 150 ACTTCGACTT CAGTATTCTCGATAAGGTGG TAGAAGAGGC GAACAACGTC 200 GATATTCGTG AAATCGCCAG CAGACCCAGCAGGAGGTGGT GGAGTAGAAC 250 GTGATGATCG GTTTCT 266 345 base pairs nucleicacid single linear NO YES DNA (other) 119 GATCATCTTC CACTTCCAGATGCACCGTCA CATCCGGGTT AGTGAGCTTC 50 ACGCGCGCCG ATTCAATATG CTGATTTAATCCGCCGCCAA CATAGCGCTC 100 CACTTCAATG GAGCTAAACT CATGCTTACC GCGACGTTTTACCCGCACGC 150 AGAAGGTTTT GCCTTCAAGC TGTTCGCGAT ACTGCGCCAA ACGCTTTCTC200 GAAAATGTCG TGCATATCGG TGAACGGCAC ATCTCGACTT CAAGAATATG 250TGAATCCCGG GATCGTGGTC AGCGCTCGGA ATCACAGACG CTGGTTTCAC 300 TTGCGCGACTCATTTACAGT CAGACACGTG TAGTGCTTAA CTCAG 345 321 base pairs nucleic acidsingle linear NO YES DNA (other) 120 GATCATCCTG GAGGTCTTTA TGGCTGATTTCACTCTCTCA AAATCGCTGT 50 TCAGCGGGAA GCATCGAGAA ACCTCCTCTA CGCCCGGAAATATTGCTTAC 100 GCCATATTTG TACTGTTTTG CTTCTGGGCC GGAGCGCAAC TCTTAAACCT150 GCTGGTTCAT GCGCCGGGCA TCTATGAGCA 
TCTGATGCAG GTACAGGATA 200CAGGTCGACC GCGGGTAGAG ATTGGGCTGG GCGACGGACG ATTTTGGCTG 250 GTCCTTCTCAGGCGCTATTA GTACGCGGTT CATGCAGTAC ATACTACCTG 300 AAGTCACGAT GCACCGAATA G321 216 base pairs nucleic acid single linear NO YES DNA (other) 121GATCGGCGCG CGTATCTCAG GCATGTGCGC CGCCAGTTGG GAAACGCGCC 50 CGCCGGGGCCCTCAATTTCA TACGCAGAAT ATCCGCGCGC GCCGACCGCG 100 CCGGCAACGG CGCGGCAGACATTGACGCCG GCGGGCAGCT CGCGGGCTGT 150 GGCAGAAGGG CGTCACGCTG CCAGGCCTCGTCTGGATAGA TTGATATTCT 200 CGACCACATC CCGAAA 216 292 base pairs nucleicacid single linear NO YES DNA (other) 122 GATCGGCAAA CAGATAGTCCTGCGACGCAT TAAATCCAGG CATTGCCGAG 50 GAGCACGCCG AAGCGGATAC GCCAGGCGGGCAGGCCATAC CTACGGTATT 100 TGTCAGACCA AACGCCTGCG GGTTGGCAAG AATTTCCTTAAAGAGGCCGT 150 TGATATCGGC ACGGGCTATA TTGCCGCCGT GTTGCTCCAG CCCCTTCTCT200 TCCATCTGAT TATAATAATC GGTCAGAGCT GACGCTGCCC TGCCGCCGTT 250CATAGTTGCA GAGTGTCACG AGCAGTGTGA TAATGATGGG TT 292 109 base pairsnucleic acid single linear NO YES DNA (other) 123 GATCAGCGCC GCGCTACGTTAATAGCCGGT TGCGACGACC GTGGACGCTA 50 GCAGAGTCGC GGATGACTTC CGTATCGGTTGGTCCACGCG TGAAATTAGT 100 TGCGCGACA 109 258 base pairs nucleic acidsingle linear NO YES DNA (other) 124 GATCGGTCGC ACGCCGGAAT ATCTGGGGAAAAAAATCGGC GTGCGTGAAA 50 TGAAAATGAC CGCGCTGGCG ATTCTGGTCA CGCCGATGCTGGTCTTGTTG 100 GGTTCGGCCT GGCGATGATG AACGGATGCC GGACGCAGCG CAATGCTGAA150 CCCTGGCCGC ACGGTTTTAG CGAAGTGCTA TATGCCGTCT TCCTCTGCCG 200CCAACAACAA CGTAGATTTT TAGTCTACCT AACTACTTCT GAACTACGGC 250 ATCTCGAC 258384 base pairs nucleic acid single linear NO YES DNA (other) 125GATCGTTGGT CTTTAAGGCC GCCGCCAAAT CGCTGTCGAC CTGCTTGTTG 50 CTGTAAAAAGCGGTATTAAA CTGCGTCGGC GGCCAGTTTT GTGATGCGAA 100 GAGCGGCGAT AACGCCCAGTCAGCTTCGCC CGTCAGACGC CGACCAGCCT 150 GTATAGAACA TTCGCACGCG CTCTCTTTTTGCCCTTTGCC CTCGACTTCC 200 GCGGCGGCTG GCCGGCGTAC ATCGCGGTTA TCCGGGCTTTAACGACCAAT 250 CTGCGCCAGT TGCTGTTGGG TAAACTGCAA GAGTTTTTGG GTGCTATGGT300 TGTGCATGAC ACAGCGTGTA CTGAACGTCT GATACCGCTT TCACGTCCCC 350TAGCGATCAT GGCCAGTGAA GTTGCATAGC TAGA 384 448 base pairs nucleic acidsingle 
linear NO YES DNA (other) 126 GATCATACCT TGCTTGATGA CTGCGCCACTAAAAACCTGA CGCCGGCGAA 50 AACCCACTGG GCGCGCCCGC TTGATGCGCC GCCCTACTACGGTTATGCGC 100 TGCGACCCGG CATCACGTTT ACCTACCTGG GTCTGAAAGT CAATGAACGT150 GCCGCGGTGC ATTTGCCGGT CATCAAGCCG CAACCTGTTT GTTGCCGGCG 200AGATGATGGC AGGAAATGTT CTGGGCAAGG GGTATACCGC AGCGTAGGCA 250 TGTCTATCGGCACAACCTTT GGCCGCATTG CAATAGAAGC CGCCCGCGCA 300 CAAGGAGGCG CACGATGAAACAGCTTGAAA ATTATCATTG AGGCACGTGC 350 TTACGAACGA AGCGAGGTGA ACTGTCATGCAGTGTGTACG TGTGTGCTAC 400 TCGAAGGTTT GCGGATTCGC ATGACAGGTG ATGTAGCGATATATCGAT 448 448 392 base pairs nucleic acid single linear NO YES DNA(other) 127 GATCCCCAGG AGGTCTGGTT TGTCAAATCG CCGAAATCCT TTTTAGGCGC 50CACGGGCCTG AAACCGCAGC AGGTCGCGCT GTTTGAAGAT TTAGTCTGCG 100 CCATGATGGTACATATTCGT CATACGGCGC ACAGCCAATT GCCGGACCGA 150 TTACCCAGGC AGTGATCTGCAGGTGGCACT TTTCGGGGAA ATGTGCGCGA 200 ACCCTATTTG TTTATTTTTC TAAATACATTCAAATATGTA TCGCTCATGA 250 GACAATAACC TGACAAATGC TTCAATAATA TTGAAAAGGAAGAGTATGAG 300 TATTCAACAT TTCGTGTCGC TTATCCTTTT TCGCATTTGC TTCCTGTTTG350 CTCACCAGAA CGCTGGTGAA GTAAAGATGC CTGAAGATCA GT 392 327 base pairsnucleic acid single linear NO YES DNA (other) 128 GATCTTGTCA AGCTGGTCAGCATATCCCGG ATATCCTCCG CCTCCCCCCC 50 CGCCACTCCG CGCGGCTTAT GAATCATCATCATGGCGTTT TCCGGCATAA 100 TGACGGGATT ACCTACCATC GCAATAGCGG ATGCCATTGAGCAGGCCATT 150 CCATCGATAT ACACCGTTTT TTTCGCCGGA TGATTTTTCA GGAGGTTATA200 AATGGCTATT CCGTCCAGTA CTGCTCCGCC AGTGAATGAA TATGCAGATT 250TATACGGTTA ATCTGTCCAG TGCAGCCAGT TCTCTGCAAA CCAGCGAGCC 300 GAAATTCCCATCTCAATCTG TCATAAT 327 306 base pairs nucleic acid single linear NO YESDNA (other) 129 GATCCGCAGG AGAAAACACG ATTGTACAAA GAGGCGCAGG ATATTATCTG50 GAAAGAGTCG CCCTGGATAC CGTTGGTGGT GGAGAAATTG GTTTCTGCTC 100 ACAGTAAAAATTTGACCGGT TTCTGGATTA TGCCGGATAC CGGTTTCAGC 150 TTTGACGATG CGGATTTAAGTAAGTAATGC GATGGGGCTG GATGGCGCGC 200 GGTTGTCGCC ATCCGTAAAA GGTTCGTGTATGCTAACTAT GTTCTCAGCG 250 CTGCTGGATT ATTCTACGTG TTGATTGTGC AGTGCTGGTGTTTATTGTCA 300 TTGTCC 306 301 base pairs nucleic acid single linear NOYES DNA 
(other) 130 GATCTCAGCG ATGTTCAGTT AAACGCTGTG CCGGATGCGGCGTAAACGTC 50 TTACCCTGCC AACGGGTTGG GTAAGCCGAA TAAGCGCCGC TCCATCCGGC 100AGCATTCACA TAAAGTCCGG CACCAGACGC TGTAACGCGC CTTGCGCAGC 150 AGCGCCGTCGCACACTCAAT ATCGGGCGCG AAAAAACGAT CCTGCGTATA 200 GTGCGCCTCC TGCTCGCGCAGTGTCTGCCG CGCCTGTTCC AGTAACGGGC 250 TGGAGGTTAA CCTTCCGTAA TTATCCTGACAGCAGCAGCA TCACGCATAT 300 G 301 329 base pairs nucleic acid singlelinear NO YES DNA (other) 131 GATCGCCGGT CAGTTCCTCC ATTAAGAGCGGCGCGCGCGC CAGCATCTCC 50 ATGCAGAAGA GCCGCGACGC CTGCGGATAA TCACGCGAAACTTCCAGCTT 100 GAGACGGATA TACTCTTTGA TGGCCTCCAT AGGGGAAAAT TCTGCGCGAA150 ACGCTTGAGC GGCGCACGAG ACATCCAGAA TCTCGTCGCA TTACCGCGAC 200ATACAGCGCC TCTTTCGAGG GATAATAATA AAGCAGATTG GTTTGGAGAC 250 GCTGCCGTAGCGGCGACTGC TCAAGACGCG CGATGATGCA TACTGGAAAC 300 ACGAGCGCGT AGATAGCTGCGTTGCACGG 329 266 base pairs nucleic acid single linear NO YES DNA(other) 132 GATCCGCCCA CGCGTTAAGG GCCGTAAACA GAGCGTCATT CATCATTACC 50GCTGGATTCA CCGCCCTTCG TTCTTCTTCT GTTAACACCA CGCGTAATCG 100 CAGACAGGCCGGGCCGCCGC CGTTGGCCAT ACTTTCTCGC AAATCAAACA 150 CCTGCATCGC GCTGATGGGGTTATCCTCCG CCACCAGCTT ATTCAGATAG 200 CGTCCAGACG CGACATGGTC TGACTTCCGCGCACCTACGC TTGAGCCGTG 250 TTCGCTTGCA CTGCTT 266 319 base pairs nucleicacid single linear NO YES DNA (other) 133 GATCAAATGC AGGCAGTAAAAGGGCGTCAT CAAGATTATC GGTACACTGT 50 GTAGCGGCGG TTTGCAGAGT ACCATGTAGCGCCGGATAAT TATGCCGGGT 100 CAGGTTGACA CCGTGCGTAC CGTTAATAGC TTCAAAGGCGTCGCAAAACG 150 CGCGGTGTTT TTCTGCGGTG ACGGGGTCTC CCGGCGCTTC AAAAGTTCGC200 ATCAAATGCG GGCGATGCTC TGATTCTGGT ACTTATCGTA CAAAACGACG 250ATCGCTCTCT CATGATATAC GCATATAGCA TCATGCCTGT CCGTGCATAG 300 TCGTAACTAGAGACATCAC 319 438 base pairs nucleic acid single linear NO YES DNA(other) 134 GATCAACCTG AACTCAACGG ACCCTGTACC GTCTAAAACG CCCTTAGCGT 50GAGTGATGCG GATTCGTATA ACAAAAAAGG CACCGTCACC GTTTATGACA 100 GCCAGGGTAATGCCCATGAC ATGAACGTCT ATTTTGTGAA AACCAAAGAT 150 AATGAATGGG CCGTGTACACCCATGACAGC AGCGATCCTG CAGCCACTGC 200 GCCAACAACG GCGTCCACTA CGCTGAAATTCAATGAAAAC GGGATTCTGG 250 AGTCTGGCGG 
TACGGTGAAC ATCACCACCG GTACGATTAATGGCGGAGCC 300 ACCTTCTCCT CAGCTTCTTA CTCATGCAGC AGACACGGGC TATACATGGA350 CATCAAACGG CTATAGGGGA CTGTGAGCTA CAGATTACAC TGATGGCACG 400TGTTGGCACT ACACGCGCGT TCGGCGATGT GTATGAAC 438 363 base pairs nucleicacid single linear NO YES DNA (other) 135 GATCTTATCC TTCCGCTACAAAATCAACTG CGCCATCTGA CGCATATTGT 50 CGGCGTGGAT AAACTGGCGG CTGCCACCACAGCGCTTGCG TTAGTCAAAT 100 CATCGACCGC AGCGAACCGT TGCAGTCAGA CATTAACATTCACGGTGATG 150 AACTGGCGGC AGTGCTGTTT ACCTCCGGCA CAGAAGGAAT GCCGAAAGGG200 TGATGTTGAC CCACAATAAT ATTCTTGCCA GCGAACGGGC GTATTGGGGG 250TTGAATTTAA CCTGGCAAGA TGTGTTCCTG ATGCTGGCGC ACTGGGAGAC 300 CGGATTTTAAGGAGGCTTTT ATGGGGTAGT ATTGCTGGAC ATCTTACCAG 350 AGCTCTACTA TAG 363 347base pairs nucleic acid single linear NO YES DNA (other) 136 GATCGATTTTCCCCTCCATG TTTTCATAGG GGAACAGGTT CGGGTTAAAA 50 ACCACCTGAC GGATATCGCACAAAAAGCCA ATCCGCTCCG CCCAGTAACC 100 GCCCAGCCCC ACGCCACAGA TTAAAGGGCGCTCGTCCACA TTCAACTGCA 150 ACATTTTGTC CACTTCTTTC AGCAGATGCT GCATATCGTGCTTAGGATGC 200 CGCGTACTGT AGCTTACCAG CCGAACATCG GGTCGATAAA CTGGTAATTG250 CGAACACTTT TTCATGGTGC GCGGACTATA TGAGTCAAAA CGTGTGATAT 300ATATCATCTG GCACCTCACG AGACTGAGTG ATGCGTGCGT TTCTGCA 347 278 base pairsnucleic acid single linear NO YES DNA (other) 137 GATCCCAGAC AATACCGTTACTGTTATCCA ACGATACCCC TGCCAGTGAG 50 GTACGCAGGA ATCCATATTG GGTGTGATGCGCGTAAGAAA CGCCCGCCAT 100 CATAGTACTT TTACGCCTGT CCAGACGACG CAACTGATGGTCATCGCTGT 150 CGCCCGGTTT GAAGTACATC GGGGACCAGT ATGCCATGAT TGACAACTTA200 TCGGCATTGT CATTCACAAG TAGTACCGCG CCAGACACGA CAGAGTTNTT 250CATAGGCATG ACGATCGATA ACAGCTAT 278 385 base pairs nucleic acid singlelinear NO YES DNA (other) 138 GATCGTTATG AATCGCTTGC GTGATTTCCAGCGTCACCGG GTCGAGACGA 50 TAAACTACGC CGCCTTTATC CAGTTTACGG CTTTGCGATGTAGCCAGCCA 100 GAGCGCGTTT TCTTGCTGAC TCCAGGCCAT CTCATAACGC CTTTGCCTAC150 CGCTTTACGC AGCATGTCTT CCGCGCCAGC GTGCTAAATG AGGATGCGAC 200GAGGAGCGAA CCTAACAATA AAGAACCACG CAGGCTGGCG AAAAAAGATG 250 ACGTAAGTGCATGACGACTC CTTTGATAAA ACGTGTATAG CTGCTTCACA 300 CTACTTCGCT GCGTGGATCTGCAGGTGGCA 
CTTTTCGGGA AGTGCGCGAC 350 CCTATTGTAT TTCTAATACT CAATATGATCGTTAT 385 282 base pairs nucleic acid single linear NO YES DNA (other)139 GATCAGCGGC TATGGCGGTC CGGAAGGCGC GAAGATGGCA CGCCGGCGGG 50 CACAGTTTGGTTTGCCTGGA ATATTAACAA TACAACTTTT ACAAGCCGAC 100 AACATTTCAA CGGAGATTGTCAGGAAGTAT TGGAAAAATG CGTACGCTTC 150 GCCCTCGCTG AATTGCTTTT CTGTTAACGAAGAAAGCATA ACATAATTTC 200 ACTGACGTCA GATACTCCGG CTAGATAAAT CGAGCTTACCGCGTGTTCGG 250 AATTCGATGA TTCGGATATC GGTCGCCATC GT 282 179 base pairsnucleic acid single linear NO YES DNA (other) 140 GATCGGCGAC TACAAAACCAATCACCGCGG CTTTACCATC GAGTTCCATA 50 TGCGTACGTT TTATCGCTGG GAGTATGGCGAGAATATGTC CCCGGCCGGA 100 TAGAACCGGT TAAAGAGACC ATGCGTTACT TTTTCATGGCGGTATACATG 150 CACAGTTGCT TGGTGGCATG ACATTGGAA 179 261 base pairsnucleic acid single linear NO YES DNA (other) 141 GATCAGTAAC AGGACGGTAGCAAAATTCGC ACTGAGCCCG GCGACATTCT 50 GAACGAACGG TTCAATATAG CTATAACTGTGTAATGCGCA GTCACCACAA 100 CGACGGTCAG TACATAGAGG CTCATCAGCG CCGGGCGTCTGAATAGCAAA 150 AGGTAAACTT TTTAGTGAGC CGGAATGCTC GTCTGGCAAT TTCGGTAGAG200 CTTATCAGAA TAGCAGCGTA TATCTCCATG CGATGCAAAG TGGCCCAGCA 250AATCTGACAC T 261 225 base pairs nucleic acid single linear NO YES DNA(other) 142 GATCATTTTG GTGCCGGTGT CAGCCTGCTG ATGTCCACTG GTCAGCGCAA 50CGGAATAGAA CTCGCCGATA TAATTATCAC CGCGCAGAAT GCAGCTCGGG 100 TATTTCCAGGTAATCGCCGA ACCGGTTTCC GACTGGGTCA ACGACATCTT 150 GCTGTTTTCC CTTCGCACAAGCCCGCTTGG TCACAAAGTT CAGATCGCCG 200 TGTGTGTGCC GGACAGTTGA CGTGA 225 301base pairs nucleic acid single linear NO YES DNA (other) 143 GATCATCCTCGGCGCGGGAG TGAATCACTG GTATCACATG GATATGAATT 50 ACCGTGGGAT GATTAACATGCTGGTGTTCT GCGGCTGTGT TGGACAAACC 100 GGCGGCGGCT GGCCGCACTA TGTCGGCCAGGAGAAGCTGC GGCCGCAAAC 150 CGGCTGGCTG CCGCTGGCTT CGCGCTGGAC TGGAATCGCCGCCGCTCAGA 200 TGAACAGTAC TCGTTTTCTA CACCATGCCA GCCAGTGGCC TATGAAACTG250 ACTGCGCAAG AGTTGCTGTG CGCTGCGATC GCTAATTCGA CTATCGATTA 300 C 301 272base pairs nucleic acid single linear NO YES DNA (other) 144 GATCATGTGGGTTTAACCCG TTGATTAAAC ATTGGATTAC GGAATAGCAA 50 TTGCTTATTT 
TATTTGTCATACAAATAAGT ATAATACCCG CTTCCGATGT 100 AGACCCGTCC TCCTTCGCCT GCGTCACGGGTCCTGGTTAT ACGCAGGCGT 150 TTCTGTATGG AATACGCCAT CCCCTCTGAT AGATGCCTTGTTGCCTTAAG 200 CAGTTAACCC GCCTGAAGCA AACGACAAGA CGGCAGACGC TTACCGGCAT250 ACGACACGGA TGCTTCAGAA GA 272 358 base pairs nucleic acid singlelinear NO YES DNA (other) 145 GATCTGCGCA CATCATTCGG GTCATCGCTAAATTTTTCAC TTTTAATTCG 50 CCGTCCGACA GTTTTCCTTC GCCGGTGAAT TGATTGCACATTTTGCCGGA 100 TACCGTCATG TCCTCGCCAA GGCTAGAGCT CCGGGCCGGT GACCGTTTTA150 CCGTTTACGC TTTCCAGAAC AAAGCGGTGG TGCTCCAGTT CGTCGCGTTT 200GACGGACACT TTTCACTGCT CACACACCTG TCATTATGAT GCTCAGGGCG 250 ACCAGCGTGATTTCTTCATT GATATTCTCT GTAATCTGAT AGGTTAACAC 300 TGACTATAGT AATGATATGACCGGATAGAT CTTCAGGGTA TCCGAAAATC 350 GTCCCTGA 358 224 base pairs nucleicacid single linear NO YES DNA (other) 146 GATCTGTTGT TACAGCATGGAATGCGCCGT CCTCCTCACC GGCCAGGCAA 50 ACGGCGCGAT CGTATCGAAC TGTGCGCCGCGCCGAAAGAA GGGGGGCTTA 100 GCCCTTCTTT CGGCGTCTTA CGCAGCGTAG CCAGCATATTAGCATTGCCT 150 AACTGCATTA TTGTCTGCGG CGGGGATTTT ACTACGTAGC GCAATTTGGC200 ACGTCTAGAA ATTCGTAAAG GTTC 224 268 base pairs nucleic acid singlelinear NO YES DNA (other) 147 GATCCTGAAT CGCCACGACA CGGGCGCCAGGCCTGCAAAC AGACGCGCGG 50 CTTCGCTGCC GACGTTACCA AAACCCTGAA CCGCAACGCGAGCGCCTTCA 100 ACAGCAATAT TCGCCCGACG TGCGGCTTCC AGCCCGCTGA CGAAAACGCC150 GCGCCCCGTC GCTTTTTCAC GGCCCAGCGA ACCGCCAAGA TGGATAGGCT 200TACCGGTGAC GTAAGATAGT GACCGTGTGC ATGATTCATG GAATACGTAT 250 CATATCATCAATATTACT 268 314 base pairs nucleic acid single linear NO YES DNA(other) 148 GATCCTGAAA AATACCAATT TTCAGCGGGC GAGCTTCGCC TTCCGCACTA 50AAACAGTGAG GAAAACGCTC GGCCAGAAAC GCGATAACTT CTTTACTGCT 100 ATTCAACTTAGGTTGATTTT CCATGAAATT TCCTGATTAC AACGGACGTA 150 GCCAACAAGC AGCAGGCATGAACAGGCGTC ATTATAATGA CGCCATCAGT 200 AATTGCTACG TTATCCGTTG ATTATCCTGCGACGTCGCAA AGATTTTTTG 250 TATCCGTCGT GCAGCACGTT CAGCTGTCAC CAGCGTACCAGGCGTGTCAT 300 CTCTCGTAAC GCAA 314 379 base pairs nucleic acid singlelinear NO YES DNA (other) 149 GATCCAGAAT ATATAAAACC CCATTAACNCCAGCGCGCTT AATAACCATG 50 TGGTCATCTG 
CGCTCCGTGG CTGGTTACGT TGTTATAAATAAGGATGGCG 100 ACCAGCCCAA CGAAGATAAC GCTGTCTACG CGACCGCGGC GGAGAGGGCT150 ATAGAAAGCA GAGTGGGGCC ATTGCGACGG GGCATGATGA ACTGATCGTA 200GAGAGCGTAA GCCAATAATT CGGCAATAAA GAGAATCAGC ACCAGGTCCG 250 TGATAGTCATTTATCTCAGA GAAATAAAAA ACGGGCGTTT GCGTAGTGTA 300 CAACAGCCTT ACTGGCCAGCAGTCTACGAG TAGCCGGCGA TACCAATGAC 350 GAGAGCCACG ATATCACAGC GTACTTCTA 379355 base pairs nucleic acid single linear NO YES DNA (other) 150GATCCAACAA GCGGCTGGCG CCATAGCCGC CGCGAACCGG CATGACGATT 50 GTATCCGGCGACGTTAGCGA GGCCAGCGAA TTAACATCGG CCAGCCGTTC 100 CGCGTCCGTA CCGGCAAAACGCTGAAAGGG CGACGAATCA CCTCGTCATT 150 CTCCACCTGA TGACCCGCGT CAGTCAGGCGCTGAACGCCG CGTAACGGCT 200 GTTGGTTAAT ACAGTAGCCC GACTGGGCGA TTAATGAAACAGAGACATGG 250 TAATTCCTTG CTGACAATAG AATCGAATGT ATATCATGCG CATATATAGG300 CGATGTCTCG TGTCGCAGTT CTGATCGGAC AGGAGGCACT AGCTCGGGGT 350 ACTTT 355278 base pairs nucleic acid single linear NO YES DNA (other) 151GATCCTTATT CCCGATGTGT TCACCTTTAA TATTCTCCAC TCGCGCGTGG 50 AGGAGATGAGCGGCGTTCCG GTCGTTCCGC TATATGACAC GCCGCTATCA 100 GGGATTAACC GTCTGCTTAAACGGGCAGAA GATATCGTGC TGGCGTCGCT 150 GATTCTGCTG CTCATCTCAC CGGTACTGTGCTGCATTGCG CTGGCGGTCA 200 ATTGAGCTCG CCGGGCCGTG ATTTGCCGCA GACGCTACGGATGGCAGGCA 250 AGCGATCAAG CTGAAGTCGT CATAGGAG 278 394 base pairs nucleicacid single linear NO YES DNA (other) 152 GATCAAAATA AAACTTTAATCCCACTGGGG CAAGAGAGTG ATGTGGTGAC 50 GCTCAGTCCG GGTCAGGCGT CGGCGCATCTGCAATTTTAC GCGCGTTATC 100 TTGCCGATGG CGGCGCGGTA ACGCCGGGGA CGCCAATGCCTCCGCAACCT 150 TCATTCTTGC CTATGAATAA GTTCTTTTTA CGCTGCGCGC ATATATTGGT200 GCTTGCTTCC CATATCATGG GCGCAGGCTG GCGTGGTAAT TGGCGGTACT 250CGCTTTATCT ATCATGCGGG CGCCCGGCAT TAAGCGTACC GGTAAGTAAC 300 CGTTCAGAAGTCGTTCTGTT AATTGATACG CATATTTACT GGTGGGTCGG 350 TTACGGAACA AAACGATGGATATAGTCCTG TGTAGTGATA TGCT 394 324 base pairs nucleic acid single linearNO YES DNA (other) 153 GATCGTTAGC AAGGTTTGCT GCGTCATCTG CTGGGTTTCACGCAATGTGT 50 GCGCGTTAAG CATCACAAAA TGGCTGGCGC GCGTCGCCCA GTGGGCATTG 100ATTTGTAATT CAAGCATACA AACCAGGTTG CGGTTGATGG TCTGAATGGC 
150 CTCGAAAATAGATTTTTGTA TCCGGGTTTC TTTACTGGCA GGCGTTATCA 200 GCCCGCGCAT TTTGACGACATCGTTCAGCA ACCGTTGCAA ATGTTATCCA 250 ACCGGGGAGT CAGCAATCGC GACAGCTGCCTTGATACCCA GTTACCTGAC 300 CGATCCGGAT GATCCGATCG GAAA 324 308 base pairsnucleic acid single linear NO YES DNA (other) 154 GATGGCTGGG AAGACGGGTGCCGTTCTGGT TAAGCGTATT CAGCTCTTCG 50 CGCGGGAAAT AGCCTTTAAT CGCCAGGGTACTGTACAACG CGGGGCCCGC 100 ATGGCCTTTC GACAGTACGA AGTAATCGCG TTCCGGCCAGTCCGGGTCGG 150 AGGGTCGATT TTCATCACCG CGCCGTACAG AACCGCCAGA GTCTCCACTA200 CCGACATGCT GCCGCCATAG TGACCAAAAG CCAAAGATGG TTTAAGGATT 250TGACGGTGGA CCGAATATCG ACAGTTGGGT GATTTCGGTT ACGTTCATTC 300 TTCCTGAA 308333 base pairs nucleic acid single linear NO YES DNA (other) 155GATCGTGGTC CAGCTTATGA ACGGTATAAC TGAGGGCGGA CGGCGTTTTA 50 AATAATTTTGCCGACGCCGC CGCGAACGTG CCTTCTTTTT CTAACGCATC 100 AAGAATAATC AGAACGTCCAGCAGTGGTTT CATACTCGTC CCCTTGCCGC 150 TATATGGCGA CCACCTGCTG GACAGCGACTCACTCCATCG GCATCACCAA 200 CGGATCGGGA TATTGATATT CAAATCCCAG CTCATTACAAATCGGCTACC 250 GTCGATAATC TTCCCTTTTG CCGTTGTCGG TGGTACGAAA ATCGCGGCGG300 CGATTCCCAG CAAGCGTATT GCGATAAACA CTG 333 334 base pairs nucleic acidsingle linear NO YES DNA (other) 156 GATCCACCCA CGTCATCAGT TGTTCAAAACCCTGCTTCAC GGTGTGTTCC 50 CATGGACCGA CCATGTGGAA AGCGGCTATC TTGCGTTTTTGTGGCTGCCT 100 GATTTCGTAA TCCATGCTGC CTCCGTCACT TCACAATGCT GTATGAATGT150 ACAGTATAAT TACAGCCTTT TACGGTCACA AGGACAGCGT GATCATTTTG 200TGAGCAACCT CGCAATCCCG CCCTTTTGAC ACCTCAGATG ACGGTGAACG 250 GTGTGTGTGACAACGGCTTA CGCTTTATGT GAAAATAGTC GTCAGACGAG 300 AGAACATACC GCCTTTACCACGATTCAGAG TGAC 334 152 base pairs nucleic acid single linear NO YES DNA(other) 157 CGTTTGCTAT CGACCTGCAG ATCGGAACGG ATTGGCGTCA CGTGATGGAT 50AAGACCGTGT TCTTCAATGT TATCTCGGCG ACACGAGCGC ATCCGGCGAA 100 ATATCGACCGCATCAACCTC TGCGTCGGGA AAGCATAACA CAGGCATGGC 150 AT 152 204 base pairsnucleic acid single linear NO YES DNA (other) 158 GATCGAACGC GCGTTGCAGCAGCGCCCGGC TATTTTCTAC CCGTGTCGTA 50 TCGCCGAAGT TGTGCCATAA CCCCAGCGAAATAGCGGGAA GTTTGACGCC 100 GCTGCGTCCG CAGCACGATA CTCCATTGTG 
TGATAACGATTCTCATCGGG 150 CTGATAAATC ATGACCTTTC CCCTGTGGCG AGAATAATAT GTGTACGGTT200 ACTC 204 283 base pairs nucleic acid single linear NO YES DNA(other) 159 GATCTTACCG AGTGGGAAAC TAATCCGCAA TCGACCCGCT ATCTGACGTT 50TCTCAAAGGT CGGGTAGGGC GCAAGGTCCG CTGACTTCTT TATGGATTTC 100 CTCGGCGCCACGGAAGGGTT GAACGCCAAA GCGCAGAATC GCGGCCTGTT 150 GCAGGCAGTG GATGATTTCACCGCAGAAGC GCAGTTGGAT AAAGCGGAAC 200 GTCAGAACGT GCGCCACGAG GTGTACAGCTACTGCAATGA GCAATTACAG 250 AGGGAGAATG AGCTGGATCG CTGTCTAAGA GCT 283 302base pairs nucleic acid single linear NO YES DNA (other) 160 GATCGCGTTCGCCAGGCAAA ATATTACCGT GCTCAAGAAT ACCGCTGCGC 50 ACGGCATCCT TTACCGTCTGGGCGAATTTC ATGTATAGCG GCGTATTATC 100 CGCCGCTGAA ATTCGTTCAT TCAGTTGCGCGATGAGCCGG GTATGCGCTT 150 GTTCCATTTA TCTTTCCTGA CGACGGGTCT GTAGGCAGTATACTACCACC 200 ACGCGTGGAA ATGATGTACC GGACCAATGC CCTTCCCCAC TTCCAGCCGT250 GTACGCTGGC AGCGCCGAAG CATGCCTTGC TCGTTTACCG TCTCTCCCAA 300 CT 302233 base pairs nucleic acid single linear NO YES DNA (other) 161GATCCTGAAT GAAAATCTCA CTGCTCGGCT TGTTGGTCAG TTCGGCCATG 50 GTCTGGCGCACGTGCTCCAG CATGCCGCCG ATATTGGTCC CGGCCTCGCC 100 GTGACGTTGT CGAGCTTGCCGCAACCGTCC ACCGCTTTGC TGATGGCTTC 150 GGACGCCGGC GGCAACATCC ACACAGCGCACCGAGACCCT GAGCCTGACG 200 CTACCGGATC CGGCGGTATG AGCGGTTAGC GAG 233 236base pairs nucleic acid single linear NO YES DNA (other) 162 GATCTGTTCCGTCTGACGGC GGGTAAACTG ACCGGCCTGG ACCGAATGGG 50 GCCAAAGTCC GCGCAAAATGTTGTTAACGC GCTGGAAAAA TCCAAAACGA 100 CGACCTTTGC GCGTTTTCTC TATGCGCTGGGCATCCGTGA AGTGGGTGAA 150 GTGACGGCGG CGGGGCTGGC GGCTTATTTC GGTACGCTGGAGGCGCTGCA 200 GGCCTCCGAC CATTGACGAG TTCGAGAAGT ACTACT 236 334 basepairs nucleic acid single linear NO YES DNA (other) 163 GATCGCGTGTCGGTGCGTGA TTTAAGCCGT GGCTTAATCG TGGATTCCGG 50 TAACGATGCC TGTGTGGCGCTGGCGGATTA TATCGCGGGC GGGCAGCCGC 100 AGTTTGTGGC GATGATGAAC AGCTATGTGAAAAAACTCAA TTTACAGGAT 150 ACCCATTTTG AAACCGTCCA CGGTCTTGGA TGCGCCGGGACAACATAGCT 200 CCGCGTATGA CCTGGCGTAC TCTACGGCGA TTATTCACCG GCCGAAGCCT250 TGAATTTATC ACATGTACAC GAGAAAAGCC TTGACCTTGA ACCGATTAGA 
300GCAGAACCGA ACGCTTGATG GATAGACACG AATG 334 308 base pairs nucleic acidsingle linear NO YES DNA (other) 164 GATCGTAGTG GAGAGTGTCG CCGAACGTCTGGTGCAGCAA ATGCAAACCT 50 TCGGCGCGCT GCTGTTAAGC CCTGCCGATA CCGACAAACTCCGCGCCGTC 100 TGCCTGCCTG AAGGCCAGGC GAATAAAAAA CTGGTCGGCA AGAGCCCATC150 GGCCATGCTG GAAGCCGCCG GGATCGTCTG TCCCTGCAAA AGCGCCGCGT 200CTGCTGATTG CGCTGGTTAA CGTCTGACGA TCCGTGGGTA CCAGCGAACA 250 GTTGATTGCCGATGCTGCCA GTGTAAAGTC AGCGATTCGA TAGTGTGTGG 300 CGCCTGAG 308 362 basepairs nucleic acid single linear NO YES DNA (other) 165 GATCCCATCGCGAATATCGG TAAAACAGCG CTTCTGCTGA CCGCCGTCGA 50 TAAGCTTGAT CGGCGTTCCTTCTACCAGGT TCAGAATCAA CTGCGTTATC 100 GCGCGTGAAC TGCCGATACG CGCCGCGTTCAGGCTATCCA GCCGCGGCCC 150 CATCCAGTTA AAGGGACGGA AAAGCGTGAA GCCAATCCCTCTTTTTGCCA 200 TAAGCCCAAA TCACCCGTCG AGAAGCTGTT TGGAAACGGA GTAAATCAGG250 GCTTATTCAC CGGCCCGACG ATCAGATTGA TTGTGTTGTA AAGAGGCTCT 300AATCGGTCAC ATTAGAGAGA GGAAACATTT AGTATTAGAT AAGATACCGA 350 GTTTAATAGT AA362 71 base pairs nucleic acid single linear NO YES DNA (other) 166ATCGCGTTGT GTTGCCGAGC ATTTATTACA AGGCGCTTCT GTGTGNCNCT 50 CGAATGGTGCNGCAAGACTG C 71 363 base pairs nucleic acid single linear NO YES DNA(other) 167 GATCGTGTCG CAATTCTTAA TGCCATAGAG GGTAATCATA TTGAATCCTT 50TAACGCGAAA TTCGAATAAA TAATCAATAG TATCGTCTGC GGGATAATAA 100 GTGTGGCCGTTTATGGTTAT TTATCCAGCG CTGATCGGCA ATCAATATAA 150 CATTGTTGAG TGAATGTGAATAATGATTCC TTTTCGTTCC AGATGTGGCT 200 TGTTTATACT TCGCCGGTAT AATCCTATTTGGGCAAATGC AATTGTGTTT 250 ACCATTGATA AGGTAGGTAG GAAAGGTATA TGTGCTAATATGGCGTAGTC 300 ACATAATTAG TCTACGGCCA TGATCAGACG CAACAGGATC GACTCGTATG350 ACTTTACGAC CGC 363 329 base pairs nucleic acid single linear NO YESDNA (other) 168 GATCCGGCGC TGATTTTCAC CATCACGTTT TTCATCGGCT GACCTGCGGC50 GTCTTTCACG TCGATGGTGG CGGCCATCTG CTCGCCCTTC TTCGCCTTTG 100 CGCTTCCGGTGGTTTCATCC TGGCCTGCCA GCGTCAGCTC AGGCTGGCGG 150 CGGCGCTGCG GGCGAGGCAAGACAGGTCTG CATGTAGTAC ATCGAGGTGC 200 TGGTCGTCGT TTGACATCAT TGCCGTCGTTAAACAGGTTG ACCGCCGCAT 250 AGAGCGACTT GTGCCGTCTG ACGATATCAC 
GTAATCCCGCCACAGTAGCG 300 CTGAGCTGTG TGCTGACTGT ATGCACTAG 329 198 base pairsnucleic acid single linear NO YES DNA (other) 169 GATCTGGCGG GCGCGTGAAAATATGTTGCT GGCCTCCTGT ATGGCGGGAA 50 TGGCCTTTTC CAGCGCCGGT CTGGGGCTGTGTCATGCGAT GGCACACCAG 100 CCTGGGGGGC GCTGCATATT CCGACGGCCA GGCCAACCGATCGTCGTCGC 150 AACAGTCATG GGCTTTAACG GATCAGTTTA CGGAAAGTTC AGTAATAT 198273 base pairs nucleic acid single linear NO YES DNA (other) 170GATCAACATC AATAACTAAA ACTCTTTTAC CAAGATAGTT AGCCATGAAC 50 TCAGCAATGCCAACACATAG AGTTGTTTTT CCTACCCCGC CTTTCATATT 100 AATAAAGCTA ATTACCGATGCTGGCATAAT TATTCCTTGC TATGTTGAGA 150 ATGAGTCATT TTGATAATTA CTCGAGCTTTTATCTTAATC TTCGCGCGTT 200 CGAATCCTTC CCTTCATGTA CTTCTCGTAC ATGGCATCCAGTTCCTTGAG 250 ACGAGATAAT ACCCGAAGAA AAT 273 244 base pairs nucleic acidsingle linear NO YES DNA (other) 171 GATCGCTGGT TCTGGCGGCA CCCTGGCGCCAACCCAAGCA ACGTCGCGCG 50 CGCGGCATGG CAGGATCTTA CCGCCGGGCG CGTTATTATTTCCGGCGGCA 100 GTACGCTGAC TATGCAGGTG GCGAGACTGC TGGACCCCGC ATTCGCGCAC150 GTTCGGCGGT AAAATCCGCC AGCTTTGGAG CCCTCCAGCT TGAATGGCAT 200TTGTCCAAGC GCGATATCCT GACGCGTGTA CTGAACCGAG AGTG 244 247 base pairsnucleic acid single linear NO YES DNA (other) 172 GATCGCGCAG CGCTCTCATAGCACAAAACG AGGTTTTCCA TTCTGTTATG 50 TTCCCTGGCG ACGATAAACG TTCGATTGTCTCATGGCGCT GGTGAACCTT 100 ATTTTTTAAC GGAGATGTTG AATGGCGGTA GAGGTTGTACGTAATGGCCA 150 AACCCGGCGG CGGATCTCGA ATATTGATTC GGCAATATTC GTTCTATCTT200 GGAAAAGGAG CGCTGTACCG GAACGGAATA AAACTGCGAT GTGCAGA 247 300 basepairs nucleic acid single linear NO YES DNA (other) 173 GATCAGCTTGCCGCACTGTA TGCCTCCAGC GACGGCAATA AAATCCACAC 50 CGTATCCGGC TGGCCGACTGAGTATGACTA CTGGTCATCC ACCTTCGCCA 100 GCGCCGCTAC ATGGCAGGCG GTATCACTGGCTGCGGGCGG CTATACCGCT 150 TCCGGCGATG CGGTCGGACT ACGTGAGCTG TCTGGTCAGCAAAAATCGAC 200 GCGCGTCTAT CACCATTGAG CCGGTGGATG CGCATTGTGT ATACGCAACA250 GCGAACACGC GTGAAGGTGA AAGGCATACG TCAGCTTAAG TGACGTAAGA 300 337 basepairs nucleic acid single linear NO YES DNA (other) 174 GATCCGGACCGTGCCTTATA CCCTGAAAAA GGGGGAGACG GTGGCGCAGG 50 CGCACGGCCT 
GACCGTCCCACAGCTGAAAA AACTGAACGG GCTCCGCACT 100 TTCGCCCGCG GCTTTGACCA CCTGCAGGCCGGCGACGAGC TTGACGTTGC 150 CGGCGGTCCC GCTGACCGGC GGGAAAGGTG ACAATAACCGCCATGACGTC 200 CGCGGTCCGT TTGCTGCTGA CCGGGAAAAT GAGGACGATC GCAGGCAGCA250 GATGGCCGGC ATGGCTCACA GGCGGCAGCT TCTGCCAGCC ATCGGACGTT 300AGGCCGCCGC GGATGGTTCG TATTCGCGTT GACATGT 337 424 base pairs nucleic acidsingle linear NO YES DNA (other) 175 GATCAATGAA GCTTTGTGGG AAGTCTTGACTTTCGTCGAT AAATACGTAA 50 TCAAGTGCCT TTTTATCAGC TCTCCCACTA TTATTTATATCTGCAATGGC 100 TTTCTTACAT AGGGCATCAA AATCGCCATT ACCAAATCCC CCAAATGGAA150 TTTCGCTAAT AATGGCATAT ATATCTGGTA CATTCCAGAA AAAGGTTCTT 200TACGTCAAAC CCCAAGAGTT GAAGCAAAAA AGTTTTTGTA CCCCATTCTA 250 TCTGTTTTTCGACTCGCATA AATCGAAAAA CTCAGGGATT CTGGTTCTCA 300 TTGTGGAGCA GATTATAAGCAGTAATGCAT CTAGATACGG TTTGATACTC 350 TCTAGTGTAG TATCAGTTAC TGACAGCTACTGCATAACCC TTTCAGCACT 400 GAGACACGTG CGCAAATGTG TAAA 424 190 base pairsnucleic acid single linear NO YES DNA (other) 176 GATCATTTGA TTAAAACCTCACACCGCAAG ATGCGACTTT TTGTAAACCT 50 GCTTTACCGC TGACACATTT CTCCGCATTACTGCGGAACA AGGCTTAAAA 100 AGCGTATCCG AACGTATAAC CCTCCAACGT TCGCTACGGGAAAAATGGGG 150 ATGAGTACTG GAAGGTCGCA TATATGACCA AGCCAGACAT 190 441 basepairs nucleic acid single linear NO YES DNA (other) 177 GATCCATGCCTGTGATGCCT GGATGTCCCG AATACTTGAA GGTTTGATCG 50 AACGGCAGGC CAGTAATGGCAACGCCACTA TTCTGTTATC TGCGACGCTA 100 TCGCAGCAGC AGCGAGATAA GCTGGTGGCGGCATTTTCCC GTGGGGTGAG 150 GCGTAGTGTG CAGGCGCGTT GCTAGGCATG ACGATTATCCCTGGCTGACT 200 CAGGTCACAC AAACAGAGCT GATTTCTCAG CGGGTTGATA CACGCAAAGA250 GGTTGAGCGT TCGGTAGATA TTGGCTGGCT ACATAGTGAA GAGGCGTGTC 300TGAACGTATA GTGAGCAGTG AAAGAACTGT ATCGCTGATA CGTACTCGTG 350 ATGATCGATCGATCTACCGA GCTACTCACT GGTAGGGCAG AACTTACTCA 400 AGGCTCTCAG GCGTCTAACAGGCGTCTAAC ACGTGGAAGT T 441 370 base pairs nucleic acid single linear NOYES DNA (other) 178 GATCGTCGTT ACCGGCGACG GTTAAAGCAA ACTGGGCATCAATGGGCCGT 50 AAGAGTTTTT GTTCAACGGC CTCCAGCAAC CGCTCCTGGA TTGTCATTGC 100GCCTCCTCAC TCATTTCACC TGCAAACATA TCATCCAGTT GGTTAATTAA 150 
CGCCGCCGCAGGACGAGTGG TAAAAATACC CTGCTGCGGA CTGTCGCCAT 200 CCACCCCGCG TAAAAAGAGATAGATGACTG CCGCCGAAAT GGCGTTCATA 250 GTCGTAATTC GTCATTCGAT GACGAAGGTAACGGTGCAAT GCCAGCGTAT 300 AAAGCTGGTA CTGCAAATAT AGCGATCGCG TGCTCCGCGCAGCCATGCGT 350 CTGGATAGCG CTATCTGCCG 370 212 base pairs nucleic acidsingle linear NO YES DNA (other) 179 GATCCGGGTA CTATGAGCCC AATCCAACACGGGGAAGTGT TCGTTACTGA 50 AGACGGCGCT GAAACCGACC TGGACCTGGG GCACTACGAGCGTTTCATCC 100 GACCAAGATG TCTCGCCGCA ACAACTTCAC GACTGGCCGC ATCTACTCGA150 CGTTTCTGCG TAAAGAACGG TGACTATCTG GGACGACAGT ATCTAATATA 200CGGATTAAGA GG 212 367 base pairs nucleic acid single linear NO YES DNA(other) 180 GATCTTCTTC ACGTCTGGCT TCATCACTCT GATGAACGAT ATGCTCGGTC 50AGATGACCTT TAATCACCTC GCGCATTAAG CCATTTACCG CGCCGCGAAT 100 CGCCGCGATCTGTTGTAACA CGGCCGCGCA TTCATGCGGT TCATCCAGCA 150 TTTTTTTTAG CCGCTATCACCTGTCCCTGA ATCTTGCTGG TTCTGGCTTT 200 AAGCTTTTGT TTGTCCCGGA TGGTATGTGACATTACAACA CCTCACTAAA 250 CATTAACGAA TACAAATTAT AGCATTACCA GATGCTACTGGGGGGTAGTA 300 TCTATACTGG GGGGAGTAGA ATCGACGCCC ACATAAAACA ACTAAGAATC350 ACTCATGGGT GAATTTC 367 196 base pairs nucleic acid single linear NOYES DNA (other) 181 GTATCACGTT TGATGCGGCT GTTATCGTCC AGATAGCCGGTGCGATAGGC 50 AAAATAATGC GGCAATGAAA GCGCCAATCG CCAGGGGGGA TCCCCACAAT 100ATATGCCAGC ACGACCCCGG GGAATACCGC ATGACTCATT GCATCGCATT 150 CGCGCTTTTACACTAAAACC CGCGTAGGAG ATCGCAATCG GACTAG 196 266 base pairs nucleic acidsingle linear NO YES DNA (other) 182 GATCTGTCGC GTTTTCGCCA GAATAGCGCGCGGAATAGAT ACCCGGCGCG 50 CCGCCTAAAA CGTCAACGGC CAGACCGGAG TCATCGGCAATGGCGGGCAG 100 GCCGGTCATT TTGGCGGCAT GGCGCGCTTT GAGAATCGCG TTTTCAATAA150 ACGTCAGGCC GGTTTCTTCC GCGGAATCGA CGCCCAGTTC CGTTTGCGCT 200ACCACATCAA GCCAAAATCG CTTAACAGCG AGCNNCACTT ACGCGTNTGC 250 GAGACACTTTNCTGAG 266 351 base pairs nucleic acid single linear NO YES DNA (other)183 GATCATCATC ATTCCGCAGC CAAACGCGCG GCTTTTACCG AACCCCTGCG 50 CCAGACGTTGCAGGAAAAGC GCGGGTTCGT TAATCACCAG CACGCCGGTA 100 TAGTCCACGC TGCTAAACTGAATCATCTGG CCGATCTTTT CCCGCGACGT 150 ATCTGCCTGC CTGCCGATAA 
GCATCAACGCTCGGCTCGGC AGAGTAAAGC 200 CATTTTGCCT CCCCCTGCGC GCCAACCACG CAGGCGCTGCTGCTGATAAG 250 ACCAAATATG CTGGCTATCA CCTGCGTTTA GTGGCGATTT AGACTCATCA300 GCAAATCGTG AGTTGCGTTT TGCAACGAGA TTGGGAGGTT AACGAGATGA 350 A 351 398base pairs nucleic acid single linear NO YES DNA (other) 184 GATCATGTGGTGATCTGCGC CGGACAGGAA CCTCGCCGCG AGCTGGCGGA 50 CCCGTTACGC GCCGCAGGTAAAACGGTACA TCTTATCGGC GGATGCGATG 100 TCGCGATGGA GCTGGATGCC CGACGGCGATTGCCAGGGCA CCCGACTGGC 150 ACTGGAGATT TAACGACTTT GCCTGATGGC GCTACGCTTATCGGGCTTAC 200 GCCGTCATAC CGGTTTTATA GGCCGGTATG ACGCTTGAGC GCTTATCGAC250 GGCGTCCTGC TTCACCGCTT TCAAAATGAC AAATTTATTG TTGGTGCTAT 300CGTCGCGCAA TTACCGAAAT CTTCTTCAGC TGTGGAAATA GTCAGATGGC 350 GTTCGCACATATACAGTTGC CGTGATTAGC ACACGCTATG CAATTCAG 398 347 base pairs nucleicacid single linear NO YES DNA (other) 185 GATCGCTATT GGTATGGCCCCACTTGCCGT ATTTCACCGG AAGCGCCGGT 50 GCCCGTGGTT AAGGTAAATA CCGTTGAGGAACGCCCGGGC GGCGCGGCGA 100 ACGTGGCGAT GAACATTGCG TGCTCTGGGA GCGAACGCCGTCTGGTCGGC 150 CTGACGGGTT ATTGATGACG CCGCGGCGCC TGAGCAAAAC GCTGGCGGAG200 GTCAATGTGA AGTGCCGACT TCGTTTCTGT GCCGACGCAT CCGACGATTA 250CCAAACTGCG AGTACTATCT ACGTAATCAG CAGCTCATTC GTTTGATTTG 300 AAGAAGGCTTTGAGGATGAC CGCAAGCCGT TGCATGAGCT ATAACCA 347 294 base pairs nucleic acidsingle linear NO YES DNA (other) 186 GATCGGCGTG CTGGCGGCGA CCTGGCCGCGGGAAATACCC TGGAAGAGGC 50 GTGTTATTTC GCCAATGCGG CGGCGGGCGT AGTGGTAGGTAAACTCGGGA 100 CGTCAACGGT TTCCCCTATT GAGCTGGAAA ACGCAGTGCG CGGACGGATA150 CCGGCTTCGG CGTTATGACC GAAGAGGAGT TGAGACAGGC CGTCGCCAGC 200GCGTAAGTCG CGAGAAGTGT CATGACCAAC GCGTTCGATA TCTGACGGCA 250 TTATGACGCAACTGGACCTA TCGGATACTT ACTAGACTAC ATAC 294 352 base pairs nucleic acidsingle linear NO YES DNA (other) 187 GATCCGCATT GTCAGGGATA TCGCCCTGAACGCGAGCTAC GCCGGCATCT 50 GCTGCTGATT ATTGCCATTG ATCACCGCCA GCTTAACGGCCCGTCGCCCT 100 GGAGCTGTAC CGTAATGTCA CCAGCAAACT TCAGCGTCGC GTCAGTAGGC150 TAGTGGCGAC CAGCAGTTCG GCAGTACGTT TTCACCGGCT GCGGATAGTT 200ATGATTGTCG AGGATCTGTT GCAAGGTTTC CGAAACAGTT ACCAGCTCGC 250 CGCGAACACAAAGTTTTCAA 
ACAGATAACG ATGTAATTGG TCATGTTGCG 300 CATAATCATC TCTCTTCAGTACATTATTCA CTATACGTGT TTAAATCGTA 350 CA 352 290 base pairs nucleic acidsingle linear NO YES DNA (other) 188 GATCCTTACC GTTTTGGTCC ATTAATACAGGAAATGGATG CCTGGCTATT 50 GACGGAAGGC ACCCACCTGC GTCCTTATGA AACGCTGGGCGCGCACGCCG 100 ATACGATGGA TGGCGTCACC GGCACCCGTT TCTCCGTCTG GGCGCCTAAT150 GCTCGTCGCG TTTCGGTTGT CGGGCAATTC AACTATTGGG ATGCGCCGTC 200GCACCCGTAT GCGTCTGCGC AAAGAGAGCG TATTTGGGAG CTGTTATCCC 250 GGCATAATGGACACTGATAA TCGAGCTCGT ATCGCAAGAA 290 213 base pairs nucleic acid singlelinear NO YES DNA (other) 189 GATCTTCAGC AACCACGACA GGAATGCCCGTCTCTTCCAT TAACAGACGG 50 TCAAGGTTAC GCAGCAGGCG CCGCCCCGGT GAGCACCATACCGCGCTCGG 100 AGATGTCTGA CGCAGCTCCG GCGGACACTG TTCCGGCGCA CCATTACCGC150 GCTGACGATA CCGGTCAACG GTTCCTTGCA ACGTTCCAGA ATCTCGTTTG 200CGTTCAGGGT AAA 213 256 base pairs nucleic acid single linear NO YES DNA(other) 190 GATCGCTTTG GTTAAATCCC CGCCGCCAGT GTCGGCGCGA CCAGAGCGGA 50ACGTGACGAT TCTGTCGGGA AGCTGCAAGC CAGTGCTGCG GCGGCCATGA 100 GGACTTCCTGCAACAGTAGA CGCGCCAGTG CGGCGGCAAT TTCGCTGCGG 150 CGGGTAAATT TAAGCTGATGCACCAGTAAA CTCAAGGCGG TGTATAGTCA 200 CTGACGCTCA CCAGACTTGC AGGGTGGCGGTTTTTTCAGG CAGCGACCGC 250 ATGGGG 256 247 base pairs nucleic acid singlelinear NO YES DNA (other) 191 GATCGTGGCT GCCGGTGCTG TCGGTGTAGCCACCACATTG ACGGCGGTCT 50 TGGGATACTC TTTCAGCACC ATCGCCACGG CGGTCAGCGTCTTAGCGCCT 100 GCCGGCTTTC AGCGTCGGCT GCTGCTGTCG AAGGTGACAT TATTCGGCAT150 ATTAGAATGA CTACTTACTC GCCCGCCTTC GGCTCACGCT AACGCCTGTG 200CCCCGATTTG TAGAGTTTGC TTCTGTACGT AGAGTAACCA GCGCGCA 247 402 base pairsnucleic acid single linear NO YES DNA (other) 192 GATCCATTTT AACTTTAGCGGCCCTTTTGG CGAGGAGATG ACTCAGCAAC 50 TGGTCGGGCT GGCGGAGTCT ATCAATGAGGAGCCGGGCTT CATCTGGAAA 100 ATCTGGACAG AAAGCGAGAA AAACCAGCAA GCTGGCGGTATTTACTTGTT 150 TGAATCCGAA GAAACGGCGC AGGCTTATAT TAAAAAACAC ACTGCGCGTC200 TTCGAAAAAT CTTGGCGTTG ATGAGGTGAC GTTTACATTA TTTGGCGTGA 250ACGACGCGCT GACGAAAATA AATCACGGCA ACCTTTGCCG CTAAATCACA 300 TAACGCAGGTTCTGTTCCGG TGCTGCTGAC CGCAACGGTA ATCTTTATAC 
350 CGGGCGAGTA CCTAAGAGGCTTTATGGACG ACAGCGACAC GACGTTTCAG 400 CG 402 240 base pairs nucleic acidsingle linear NO YES DNA (other) 193 GATCGCGAAG CCGCACAACG TAAGCAGGGGTTATGTAGTG TGTTCTTCAA 50 CACCACGCTA TTCATGCCGT ACCGCAGGTA GATGTCCCCCTTAGGAGCAT 100 CGCTTACGCT GGGAACAGCG TTTAAGCAGC TTTTTGACAA GGGAGCTTTG150 ATGTATTGTT TGCAGTTCTA GACCTGACAC GGGCGATGAA TAGGAGCAAA 200GCGTGGTTTA CACATCCATA TTGCTATGTT ACACTATTAC 240 248 base pairs nucleicacid single linear NO YES DNA (other) 194 GATCCCCTCT ATACCGCAGACAACACAAGG CGCGCTTGCT AACGCGGTGT 50 TACAGGGCGA AATCTTTCTA CAGCGCGAGGGACATATCCA GCAACGGATG 100 GGCGGGATGA ATGCGCGCTC GAAAGTCGCA GGAATGTTAATGCGCCAGGA 150 TAACGCCCTC CGCTAAATTC TTGGTATTTT ATTTGGCTGG CCGACGTCGC200 AAATTAGCCA AAGTTAGCCA ACTTCTAGCT GATTCATCTA CGATAATT 248 304 basepairs nucleic acid single linear NO YES DNA (other) 195 GATCGGGGTTCAGCTCAAAT TTTTCAATCG CCCAGGCAAC ACCATCTTCA 50 AGGTTCGATT TAGTCACAAAGTTAGCCACC TCTTTGACCG ACGGAATGGC 100 GTTGTCCATT GCCACGCCCA TACCGGCGTATTCGATCATC GCAATGTCGT 150 TTTCCTTGAT CGCCATCACC TCCTCTGCTT AATACCCAGCGCCTCGACCA 200 GTGATTTACG CCAGTGCCTT TATTAACCGT TATCGAGGAT TCAAGGAAAT250 ACGACACTTA CGCACGGTAC TTCTCATTGC GAACGCATGC GCGAACGCAG 300 TCAT 304301 base pairs nucleic acid single linear NO YES DNA (other) 196GATCTGCGCC CCAGCGTTTG CAGCAGAAAA TAAAAGCCGA AAATCACCAC 50 TAAACAGGCGATCAACACGT AGAGAAGCAA CCTCCCAATC AATTTCATGG 100 TCTTCCATCC CGTGAAATGCACATAGGGGA TTTATGCACG ATTTGCGTGC 150 AATCCTCAAG ACAGGAATGG TGAAAGAGCGTTACAGCAGC GGCGAATCGT 200 GTCGCGCGCA GGGTTTTTAC GGTTTTTCGG CGGAGAATCAGTCAGCACGA 250 TAGCGTGATG CGCAGCGATC GATGAGAGCG ATTTACCATC GGACTGAGAT300 T 301 366 base pairs nucleic acid single linear NO YES DNA (other)197 GATCCAATCC TGAACGCCGA ATTTTCACCA CAGGGCGTTG CGCTACGCCA 50 GTTCACTACCCGCTGGGAAG GCGGTATGGT CAGAACTTCC GGCGCCTGGT 100 TACGCGAAGG CAAAGCGCTTATTCTGGACG ATACCGCTAT CGCCGGGCTG 150 GAGTATACGC TGCCGGAAAA CTGGAAGCAGTTATGGATGA AGCCGCTGCC 200 CGACTGGTTG AACAGCTGAC GCTGAAAAAT TCAGGCAGCGCAATCTGGTG 250 ATTGATATCG ACCCGGCCTT CCGTGCAAAT CACCGCTCTG 
ACGCTACGCG300 CAAACTGAGC TGTACAACCA TCATCAATGG GCTCTGAGCG CATCGACTAC 350GGCAGCGGAA CTTTAC 366 310 base pairs nucleic acid single linear NO YESDNA (other) 198 GATCGCTACC CAATTCCGCG CCCACACAGC CTGCTTTAAT CCATTGCGCT50 AGGTTTTCCG GCGTCACGCG CCGACGCAAA TAGCGGAACA TCCGGCGGAA 100 GTACCGCTTTCAGCGCGCTG ATGTAGCCCG GACCAAACGC CGACGACGGG 150 AAAATTTTTA ACTTCTGTGCTCTCTGCATC CAGCGCAGAA AAGGCTTCCG 200 TTGCCGTCGC GCAGCCGACA CACGTCATGCCATAGCTCAC CGCCGCGAAT 250 CACTCGGTTG ATATCGCGTA CATCACTTCG CCATCGCACGTGTTCTTCGT 300 TAGCTGTACA 310 348 base pairs nucleic acid single linearNO YES DNA (other) 199 TCGAAAATAC GTATACCCTG ACAGTGAAAG CAACCGATGTTGCAGGCAAC 50 ACGGCGACGG AAACGCTCAA TTTTATCATT GATACCACAT TGTGGACACC 100GACCATCACG CTGGATAGCG CAGATGATAG CGGCACCGCC AACGATAATA 150 AGACTAACGTTAAAACGCCC GGGTTTTATT ATCGGCGGTA TTGATTGATT 200 CTGACGTGAC TCAGGTCGTCGTGCAGGTGA TGCGCGATGG TCACAGCGAG 250 GAGGTGGAGC TGACCGAGAC TAACGGGCAGTGGCGTTTGT ACCGGCACGC 300 GTGGACTGAT AGGCGACTAT CGCGTACGTA GTGAAGATAGCGTATATA 348 279 base pairs nucleic acid single linear NO YES DNA(other) 200 GATCGGATAA CGACTCCGCG GTGGATGCGC AAATGTTGCT TGGCCTGATT 50TACGCCAACG GTGGGCATTG CCGCCGATGA TGAAAAAGCC GCCTGGTATT 100 TCAAACGCAGTTCCGCCATT TCCGTACCGG CTATCAGAAT ACTGCGGGAA 150 TGATGTTTTA AACGGTGGAACCGGGCTTTA TTGAAAAGAA TAAGCAGAAG 200 GTGTTGCACT GGTTGGATCT AGCTGTCTGGAGGTTTGATA CCGATACCGT 250 TGCAAGATTC GAACGCTACG ATGCTATTT 279 272 basepairs nucleic acid single linear NO YES DNA (other) 201 GATCGCCAGGGACGATGGCG AGCTGGGCCC CTTGTAAATC GTTTTTGGTG 50 AGGCCGAGAT GAAAAACATCAGACTTGGAC ATATAAAACT CCTCTGTGAA 100 TCGGGTTTGT CAGAAGAAGA AAGAGACACTTTACCTAAGG ATAAAGATAT 150 TTTGGTGCAT CATCACTATG CGTAAAACAA TTGCGTGTTCCATTAAAAAG 200 AGATGCCCCA TCACAATAAA TAATCAATAT GCAGGCATTG CACAAAGCAT250 AGGCGTTTAG GCATGTGTTG TA 272 401 base pairs nucleic acid singlelinear NO YES DNA (other) 202 GATCCAATAA TGACTGCATT GCCTCATACCCCATACGTAA CGCGCTATAC 50 AAAATATAGA TGCCGATACC TAACGCAAAC AGGGCATCCGCACGATGCCA 100 ACCGTACCAG GATAACCCCA GCGCGATAAG AATCGCTCCG TTCATCATAA150 
CATCAGACTG ATAATGAAGC ATATCGCCCG TACCGCCTGA CTTTGGGTCT 200TGCGTACCAC CCAGCGCTGA AACGTGACCA GTATAATAGT GCATATCAGA 250 GCATGACGGTAACGCCAATC CCACGCGGGG TCGTTCATTG GCGTGGCTTT 300 AATCAGATTC TGAATACTGGTCAAAAACAG AAACACGCGA ACCGGAAATA 350 ACTACTTTGC GCGCGCAGGC ACTCGTTTACGTGCCAAGGG TTAATGGTGG 400 G 401 169 base pairs nucleic acid singlelinear NO YES DNA (other) 203 GATCCAAAGT CGTTAAATAA CGGCGGGAAAAGCCTCCACG CCATGGAAGT 50 GCCCCGGAAA TCGCCCCGAC CATGGTGGCG ACAGTATCAGTATCATTGCC 100 GATATTAACC GCCGGAGATA ATAGCATCTA CGGCAGAATT CGGACAACAC150 GCGAACAGGC CAAAGCGGC 169 253 base pairs nucleic acid single linearNO YES DNA (other) 204 GATCCAAAGT CGTTAAATAA TCGGCGGGAA AAGCCTCCACGCCATGGAAT 50 GCGCCGGAAA TCACCCCGAC CATGGTGGCG ACAGTATCAG TATCATTGCC 100GATATTAACG CCGGAGATAA TAGCATCTAC GGCAGAATTC GGACAACACG 150 CGAACAGGCCAAAGGGCCGG CACCGCTTCA CTCACGTGCA GCCGGAGCAA 200 TATATAGCAG TTCACACGCGTTCCATGGAT GAGCTTCGAT ATAGCTCAGT 250 ATG 253 198 base pairs nucleic acidsingle linear NO YES DNA (other) 205 GATCGTACAG ACCCGCGTTG TCATAACCACGGGTTTTTAG TTCCGCCACA 50 CGCTCGCCCG CCAGCGTTTT CATATCCTCT TTCGAGCCAAAATGAATGGC 100 GCCGGTTGGA CAGGTCTTCA CGTCAGGCCG GTTCTTGCCG ACGTGTCACG150 CGGTCAACGC ACAGCGTACA TTATGACGTC GTTGTCTTCC GGTTGAGG 198 411 basepairs nucleic acid single linear NO YES DNA (other) 206 GATCGGAATGCCTTTGAACA GCGGCAGGTC TTCCAGCGGC AGTCCGCCGG 50 TCACGGTCAC TTTAAAGCCCATATCGGACA GCCGCTTAAT CGCGGTAATA 100 TCCGCCTCGC CCCACGCCAC GGCTGCCGCCTGGGGTCACG GCTGCGGTGA 150 TAAACCACTT GCTGAATACC CGCATCACGC CACTGCTGCGCCTGTTCCCA 200 GGTCCAGTAA CCGGTCAGTT CGATCTGCAC GTCGCCGTTG AACTCTTTCG250 CCACATCCAG GGCTTTTGCG GTGTTGATAT CGCACAGCAA ATCACGGTAC 300CAGTACGGTT GGCTTCGAAA CACATACGGG AGAGGATTTA CGAATGCATT 350 GGGAGAGATTGGGTAGGTCA GTAGACGAGA ATGCAGAGAT GGCATGAAGA 400 TTGAAGGGTA G 411 402base pairs nucleic acid single linear NO YES DNA (other) 207 GATCCTGAGCCGGGTAGCCA GTATTTGCAG GCAGCAGAGG CAGGTGACAG 50 ACGCGCACAA TATTTTCTGGCCGACAGTTG GTTGAGCTAT GGCGATTTGA 100 ACAAAGCTGA ATACTGGGCG CAAAAAGCCGCCGACAGTGG CGACGCCGAC 150 
GCCCTGCGCG CTACTGGCCG AAATCAAAAT CACTAATCCGGTAAGCCTGG 200 ATTATCCCGA CGCGAAAAAG CTGGCTGAAA AGGCGGCTAA CGCGGCAGTA250 AAGCGGGAGA AATTACGTGG CGCGGATCCT GGTCAACACC CAGGCCGGGC 300CGGACTACCA AAGCCATCTC GCTGCTGCAA AAGGCCTCTG AAGATCTGGA 350 TACGACTCGCGTGATCGCAA TGTGCTTGCT ATTGACTGGG CATCTCGTTA 400 AA 402 288 base pairsnucleic acid single linear NO YES DNA (other) 208 GATCAAACGC GCTGGCGTAATCGCTACTGG GTTGATAGCG AAGGCCAAAT 50 TCGCCAGACG GAACAGTATC TGGGCGCGAATTACTTTCCG GTGAAAACCA 100 CGATGATTAA GGCGGCAAAA TCATGATGAA AAGGACGATAAGCGCGCTGG 150 CGTGGCCTTT GTCGCGTCAT CCGCCTTTGC CAGCGGCACT GTTACCGTTT200 TTACCCAGGG TAATAGCGAG CTAAAACGCT GACAGACGCT GAGCGCTCGC 250TCGATTAGTG GACAGCGCGC TGCACGAGCT GGTGGCTG 288 169 base pairs nucleicacid single linear NO YES DNA (other) 209 GATCAGGGAA CCTGTACCTCTTAAAGAGAA GTTCGATACC CCCAACGGTC 50 TGGCGCAGTT CTTCACCTGC GACTGGGTAGCGCCTATCGA TAAACTCACC 100 GAAGAGTACC CGATGGTACT GTCGACGGTC CGAGTCGCCACTACTCTCCG 150 TCAATGACCG GTAACTGTC 169 311 base pairs nucleic acidsingle linear NO YES DNA (other) 210 GATCATCTTC GTCCTGCTCT TCCTGACTCAGCGCACTGTT TACGACAATA 50 CTGTCCGCAT CTCGTTGTGC GATTTTATCG GCGACGTCGCGGGAATAATC 100 GCATATTCAC ATTCACCGCT GTTATTGATA ACCAGACGGC AATCGCAGAC150 GCCCATTAAT CAGTTGCGTC TGAGTGAGCT TATCCACGTC TATTTTTTTG 200ATGACGTTAT TATCGGTGAA GTTAAAACCA ATATCGCCTT TAGATACATT 250 GATTCTATTCATTTCAATAA GTTGCTTAAC CTGAGCTTTA AACTCTTCGC 300 TAAAACCGCT G 311 368base pairs nucleic acid single linear NO YES DNA (other) 211 GATCAGTATCATCAGTAATG GCCAGCGTTG CAGTATTCTG AATAGCCAGT 50 GAGGTTTTCA GCGGGAAAATGGCGAGGGTA TACGGAACCG GTTCGGTGGT 100 GCCTTTTGTA GCAACGGTAA ACATTTCCATATTGCCGTTT TTGATAATCC 150 GGTGGAAGAC TTCTGCCAGA CTGGGCTATC AACGGTTCCTGAGATAGCGT 200 CAGATTTTAC ACCATCAGCG GTAACGTCGC GTATCGGTAT AAATAGAGAA250 CGCGCCGATT TTTACACCTT CGGTTGTTTG CCAACGCGAG ACATTGTGGA 300TCAGATACTA TACTATAGTC ATATCGCATG GCTATGAGAT ACGAGTGCCT 350 GGTGGTGTGCACGTATGA 368 258 base pairs nucleic acid single linear NO YES DNA(other) 212 GATCATCCAC TCATCTTTGC CGGTTGAGCC CGATAGTTAC CCGTTCAATA 
50CCGGCATCAA TCGCCCCCGT TTTATTCACC ACCCCCAGAA AGCCGCCGAT 100 AATCAAGACAAACAGCCGCG ACGTCAATGG CGCCGGCGGT GTAGGTTTCT 150 GGGTTATAGA GGCCGTCAATCGGCGCCAGC AAAACAGCGG TAATCCTTTC 200 CGGATGCGCA CGGGGCATAC GCTCCGCACCGACTTTCAGA GCTGCTATCG 250 ATTGATTT 258 322 base pairs nucleic acidsingle linear NO YES DNA (other) 213 GATCATTGTC ACGCCATTTT TTTAAATTATTAGTATGGCG TGTGGAGACG 50 CGTATCTGCT CACCAATATA CGTATTGTCC ATAGGCGTAGACAAGCTCCA 100 TTGCTACAAA GATAATTTTA TTTAAGTGTC AGGAAAATTC CGGACAAATC150 CCTTTTTTAA TAAAAATACA CACTCTCGGC ATGGGATAAT ACTTAATTAA 200CTTTTGTTAG CGTTTTGAAA TTAAAAACAG CGCAGAGGTA ATAATAGAAA 250 ATAACGTTAACAGGCTGGGT GAGTATATTT GACTGACACA ATTCCAGGTG 300 TATATGTATG CGTTTATGCA TG322 320 base pairs nucleic acid single linear NO YES DNA (other) 214GATCATCCGC AGAAGAAAAA ATATGGCCGC GTAGAGATGG TGGGGCCGTT 50 CTCCGTTCGCGACGGAGAGG ATAATTACCA GCTTTACTTG ATTCGACCGG 100 CCAGCAGTTC GCAATCCGATTTTATTAATC TGCTGTTTGA CCGCCCGCTT 150 CTGTTGCTCA TTGTCACGAT GCTGGTCAGTTCGGCGCTCT TGCCTATGGC 200 TGGCATGGAG TCTGGCGAAA CCGGCGCGTA AGTTGAAAAACGGGCTGATG 250 AAGTGGCGCA AGGCAACCTG CGTCAGATCC GGAGTGGAGG GGAGAGTTCT300 GGTGCAGTTT AACAGATCTA 320 277 base pairs nucleic acid single linearNO YES DNA (other) 215 GATCAGATGG ACCACAACGA GCACCGAAAA CAAAACGGCGCTGACCATCA 50 GAATGACGGT AGTGCCGAGT TTCATGGGGC GTTTGCGTAA CGCCGGCATG 100GCAGGGAGTG TTTCATAGTG GACCTGAGCG ACGAATCGTA AGGTTATTAT 150 CCCTGATGAGGCTCTAATTC AAAGGCATAG GCAGTCGTCC AGTGTGAAAG 200 CCGCTGCTGC AGGCCGCTACTGCATCGTAT ATCGGACGAG ATTTCAATCA 250 ATAACACGCA ATTTCCGCAT CCAACCG 277330 base pairs nucleic acid single linear NO YES DNA (other) 216GATCCTGAAA CGCTGACCAG ACGCCGAGCG CGCCGTACCA CGAATCTCCG 50 GTGGCACTCTGCGCACAACC TCTACGCCCA GCGATGGGAA CATCAGCGAA 100 CAGCCGCAGC CGGTAATCGCCGCGCCAATC AGCGAGCCTG CTGACGGAGC 150 GGCCCACATT ACCGCCAGTC CGGTCCCTCTACCAGTAGTG AAAAGGTTGC 200 ACCGTGCGCG CGTAACGGTC GGGAAATTTG GCGCAGAAAAGCGGACAGCG 250 ATAAACGCAT CAACACTATG AAACGGTGAT ACAGTAGTGT GACAGAGTGT300 ATCTAGTGAC ATCTGACAAC TTCTCTCAGC 330 223 base pairs nucleic acidsingle 
linear NO YES DNA (other) 217 GATCTGGGCG AAATCGCGCG GAGTCTGGCGGCGGGCGATA TCATTACCCA 50 CTGTTACAAC GGTAAGCCGA ACCGTATCTT CGGCCTGACGGCGAGCTGCG 100 GCCTCGGTGA CACGAGCGCT GGCCGGCGGC GAGGCTATGG AGTCGGCATG150 GTACCGCCAG TCCTGAGCTT TGCGTGGCTA ACTCGCTATA GCTGGATTTA 200CCGCATACAT CAGTCGATAT CTC 223 316 base pairs nucleic acid single linearNO YES DNA (other) 218 GATCGCCACC GTTTTGTGAT GCGCGCCAAT TTGGGCTGGATAGAAACCGG 50 TGATTTCGAC AAAGTTCCGC CGGATTTACG TTTCTTCGCC GGGGGGACCG 100CAGTATTCGC GGCTATAAAT ACAAATCTAT TTCGCCTAAA GATAGCGACG 150 GCAATCTTAAAGGCGCCTCA AAACTGGCAA CCGGATCGCT GGAGTACCAG 200 TATAACGTCA CCGGTAAATGGTGGGGGCAG TGTTTGTCGA TAGCGCGAGC 250 GTGAGTGATA TCGCGTAGCA TTCAAACCGGACGCCGACCG ACCGACCGTG 300 GCTTCAACCT ATTCAC 316 182 base pairs nucleicacid single linear NO YES DNA (other) 219 GATCTGGGGT GGGGGATTGTTGATGGTGTG TGGAGCGCTG CTGAGCGGAT 50 GGCGGGGGAG GAAGCATCCT GAGTTATTGCCTGATGGCGC TGCGCTTATC 100 AGGCCTACGA GTGAAAAGCA TGGTAGGCCG GATAAGGCGTTCACCGCATC 150 CCGAAAACGA TGTTACTTTT GGCTTTACTG AT 182 419 base pairsnucleic acid single linear NO YES DNA (other) 220 TGCAGATCAA AACAGCGACGGCTGGCAAAA GCGGTAAAGG TTTACGACCG 50 GTCAGCGCCC CAGCCGCCGC CGTGCCAATCACATTCGCCT CCATAATACC 100 GCAGTTAATG ACATGCTGCG GGTAGTCACG CGCCACGCTGTCATCGCCAT 150 TGAGCTCATT AATCAGCCTC AGGATATGGC TTCAGCCTCA AGCGCAATAA200 TTGGGCTTCC GGCCTCAATC TGCCCGGCGA TAAAACCGGC GTAAACTTTG 250CGCATTTCGA TATCGTCTTT AAGCCCTGGG AAGCTTAATC ATGCATGACC 300 TCCAGTTGATGAATGGCCTC ATTGAACGTT GCTTATCGCA TCGTCAGCGT 350 AAGTGGTGAG AATTCGTTAACTGCTCAGGC ATGCACCCTG CCTTATGCTG 400 TCAAGGATCA CACCGTGCT 419 126 basepairs nucleic acid single linear NO YES DNA (other) 221 GATCTTATGACATTGTGAGT ATCCATCGCT TTTTGTACTG AGCTGTAGGC 50 AACTCCGACA GCTTTTGCTCAGCAGCTGTT GTTTCTCATA AGCTAGTGAC 100 CAAGCTGCTG CTACCACAGG TCTGGG 126192 base pairs nucleic acid single linear NO YES DNA (other) 222GATCCTGCAC GCACGGGCGC ACAGCACCGA CAAGCTGTCC AGCTACTTGA 50 CACAGCGCCAGCGCGTGCTA GCGAGCGAAC CCGCAGGTGG CACATGGCGG 100 GGACGGCGAG CAGGAGACAGGCTAGAACGC TTTATGTGCG CACTATGCTA 
150 TCAAATAGGC CGTCCGGCTG CACGCCGACACTACCCTGAC AA 192 331 base pairs nucleic acid single linear NO YES DNA(other) 223 GATCACCGCA TCGCGAACTG GTTACGGGCC TGTGGAGCGT ATTTTTTGAT 50GTTATTGGTA TTCATAGAAA ATCCTGCAAA GGGCAGCAGA GCGCTGCCCT 100 GAAATGGGGGTTACTGAAGA CGAATCCGGT CACCTGCCTC AATAGCTGCC 150 AGCAGCGAAG TACGAAGCGTATCCAGCGCT TTTTCCACCT GTTCGGCGGT 200 TTCCAGCACT TCGCCACCGG TGGCTTTGCGCATCTCGCTG GCGACATTCA 250 CCAGATGCGT TTTTTCGGTA CCGGTTGGAT AACGGTTCTCTACCACAACA 300 TAAGCTCGTT GTGACTCGGC GCCTTAGCTT A 331 410 base pairsnucleic acid single linear NO YES DNA (other) 224 GATCTAACGT ATCACGACTAAACGTAAGGG TAAAGCGGCT GGCGTATCGT 50 CCGGGCATAA AGTCATATCG CCTGAACAGATAACATCTCA CTGACTTTGA 100 AACGCGATTT TATAATTTGC TGCCCAAAAA TACGTGGCGCTGAAAGGCGC 150 ATTTTTGATG CAAATCATTT ATTACTGTGA TAACACTGCG CGCGATAAAA200 CATTAATATA TTCACATAGT AATATGTTCT ATTGGAATGG TTGTTTCGAT 250ATGACAAAGT CTAAAAAACC ATTGATGTGA AAAGGAATAA GAATTGTCTA 300 TATTCCGATTCGGTGGAATT AAGTATTCTC GGATAAAATA GAATGATATT 350 GATATTCTTT TGATATGGTCTATAGCGCTA TGTATCAGAC GCGTGATCGT 400 CGGAGATCAG 410 185 base pairsnucleic acid single linear NO YES DNA (other) 225 GATCTTCGAC TGCCGCGCTTCCGCGACAGC GACATACGGG TGTTCTTTGT 50 CGGTGACGTT TATCCGTTGT CGTGACCTTCATCCGGTGGT GAAACCTGAG 100 CCGAATAATA CTGTACACCA CCACCAGGAC AGAATACTCAAACCACGTTC 150 ATGTGATTGT TGCACCACAT ATTCATTGTT GGAAC 185 276 base pairsnucleic acid single linear NO YES DNA (other) 226 GATCCGCTGA CAGATGTCGTGTACAGCATT CTTTAGAGTG GAACGGTGAC 50 CGTACCGCAA AGCTGTGAAA TCAACGCCGGACAAACGATT CTGGTAAATT 100 TCGGCGCATT ATACAGCGGC AATTTCAACC ATGCAGGCCAAAAGCCGGAG 150 GGGGTACGAG CGAAAAAATT CAGTCGCTTC CGGTAAAGTG CAGCGGTCTG200 GATTCGCAGG TCAATTTAAC AATGCGTCTT ATCGCTCCGC GGATAGCACG 250TCCAGCTATC GCTCGATATG CGATGT 276 383 base pairs nucleic acid singlelinear NO YES DNA (other) 227 GATCACCGAC CGGACGGTCC GTACCTGGATTGGGGAGGCG GTTGAGTCCG 50 CAGCGGCTGA CGACGTGACG TTCTCAGACC CGGTGACACCCCATACTTCC 100 GCCACTCCTA TGCGATGCAC ATGCTGTACG CGGCATACCG CTGAAGGTGC150 TGCAGGCGCT GATGGGACAC AAATCGGTGA GCCTGACGAG 
TGTACCGAAA 200GTGTTTGCGC TTGATGTTGC CGCACGACAC CGGGTGCAGT TTCAGATGCC 250 GGGTGCTGATGCAGTGGCTA TGCTCAAAGG AGGTTCATAG AGACGTGTAT 300 GCATTTTCAG CTTCGCTGCACAGCATCGAA CGGAGTTTAC GCGTTTATCA 350 GCCATGTCTG CGCACAGAGG AGTGTGCTCGAAA 383 357 base pairs nucleic acid single linear NO YES DNA (other) 228ACTTGCCGGT AATTTCCATC CCTTCCAGCA CCGCCATCTC TTTACCCTCA 50 ATGGCGATGGACAGTTTATC CAGCGTTAAC TTTTGGTCGC CCCACGTTCG 100 CCAAAGCTTG CCAGTTTACTGGTACCGTCG GTTTTCAAAT TATTAAAGGT 150 GAGTTGGACC TTCTGATTAT ATTCGTTAACGGCATCGACC AGGCCGCTCT 200 CGCGCTTCGC CTGACAGCGA AACCACATTA CCGTCTTTATCGGGCGTTAA 250 CGGGAACTCG GCGCCGCTAA AGGCACCTTT ACCGGCATTC TCTGAGTTAA300 CCGGCTTGAG AGAGATATCG GAGCGGTATC GCCGCCATAC ATGCGGTATT 350 GATACAA357 225 base pairs nucleic acid single linear NO YES DNA (other) 229GATCTATTTC GGACAGCCAA AAGGCCGTGA AGGCAGCGGT CAGTACAAAA 50 AGCCTTTGATACCGAAGTTT ATCACCGGCT TTGAGATCGA GCGCAGTTGC 100 CCGTATGCCT TTGAATCGGCGCGTTAAACC GGCCGTAAAG TACCCTCTAT 150 TGATAAAGCC AACTACTGCA AGCTCTATCTGTGGCGTGAA TACGTCAATA 200 GTGGAAAACG TATCCGATGT GAACT 225 275 base pairsnucleic acid single linear NO YES DNA (other) 230 GATCGTTAAA CAGATTGACCAGTTCGCCAC ACTCTTCCAG ATTAAACCCC 50 ACCTGCCTCG CCTGTCGCAG CAACGTCAGCTCGTTTAAAT GCTTCTGCGT 100 GTAGGTGCGA TAACCATTTT CGCTACGTAA TGGCGGCGTCACCAGCCCTT 150 TCTCTTCATA AAACCGAATG GCTTTGTGGT TAGCGTTTTG GCACATCGCT200 ATATCATATT GCCCTGCCTA CTGCTGAGTT ACTATACGGG TACTACGTCT 250AGAGATCGCG AAAAGGTTAC AGTAC 275 233 base pairs nucleic acid singlelinear NO YES DNA (other) 231 GATCGACGTC GCCTGATTTA AGACCCGCAAGCAACATCGT ATTGTTCATG 50 GTCGCGACCT GTAACGAGGT CGATTTTTGC TGTTGATGGAACCGCCCAAT 100 AGCCGCCGGG AGTATACCCA GCGCAGGTGG GGAGCGGCAA CACGCACCAT150 CGGCGCTAGC TCCTCTTTGG CGATTCGATC GGATCCTGGC GGTGGTATTC 200ATGATCTAAT CCTTTTATCG ATGAGTAAAA TTG 233 358 base pairs nucleic acidsingle linear NO YES DNA (other) 232 GATCGGCGGA GAATCCCAGA CAGGCCAGGTCTTTCAGCTC GTCGCGGGTC 50 ATCGGGCCGG TAGTATCCTG AGAACCGACG GAGGTCATTTTCGGTTCGCA 100 GTACGCGCCC GGACGGATAC CTTTCACACC ACAGGCGCGA CCGACCATTT150 
TCTGTGCCAG CGAGAAGCCA CGGCTGCTTT CCGCCACGTC TTTTGCTGAC 200GGAAAACGTC TGAGTGCGCA GCCAGCGCTT CACGCTTTTG TGCTAGCACG 250 CGATATCACGATACACACGC ACGACTCGTC ATCAGCACGT CGTTCAGTCG 300 AGTGCAGTAG CGCGTCATGATGCGTACTGC TTGACGTAGA CTATCATGCC 350 ATATCAGT 358 302 base pairs nucleicacid single linear NO YES DNA (other) 233 GATCCACAGG TAGCGTGATGCGTTTTAGTT CCCCCTGCTG CTCAAGTAGC 50 GTCAGGCCGT CGCGTAAATC GTGATATTTCATGGCGTCCA TTGTAGCCTC 100 TTGGTAAGCG CATCATTATA CGGCGTTCAT CATCGGGATGCTGTATTTTT 150 GTTAAATTAG CGTGAACTCT GGCAACCAAC GCTAATCCAG ATACGGCTTA200 AAGGATGAAG TGTATATTAA CTTCGCGCAT GGCTTTTGCT ATGCTTGCGC 250CCCGAACAGC GATAAGAGTC ATATGCATCT GGTATTTACT GTACTGCAAA 300 CG 302 374base pairs nucleic acid single linear NO YES DNA (other) 234 GATCGTCACCTCCACCCTCG CGCGCGGGGC GGTGAAGCTC TCGAAACAGA 50 AAGTTATCGT GAAGCACCTTGATGCGATTC AGAACTTCGG CGCGATGGAT 100 ATCTGTGCAC TGATAAAACC GGCAACTCTGACGCAGGATA AAATTGTGCT 150 GGAGAAATCA CACGATATTT CTGGTAAGCC CAGCGAGCATGTCTGCATTG 200 CGCCTTGCTG ACACATTATC AGACCGTCTA AAAAAATTTC TGATACGCGT250 CTGAGAGTAG ACAACGCGGT CACCTCGACG TGCAGAAAAT CGATAGATCC 300GTTATTTAGC GTGCGATGTC GTAGTGTGCG AGATCGACGT GCATCAGCTG 350 GATCTGCAAGCTAACGAGAC TCAC 374 355 base pairs nucleic acid single linear NO YES DNA(other) 235 GATCGGACTT TATTCGCGCG ATAGTCACGG AAAAAATGGT TTAACTTTGC 50TAATTCATCC TGAATGTAGG CTCTTCCATC GAAAAACTCC GCCTTGATTG 100 ACTCTCCGGTATGGAGATTG TTTAACGTCA AAAATGCGCG CCGTGGGGTC 150 GAGAGTGTGG CAAACGCTGAGCGCGGGCAG GATGGCGGCG CGAGAGCGAC 200 ACCACCAAGC GCCAGAGCTT GCGCGATTAGCGTCAAATTT GTCATGATAA 250 TCAGGTCTAC AGGTCAATGT TATCGTTAAT ACACTTCTACCTTTAAGCAG 300 ACATGATACG CTGACACGAC TCTACGCGTG ATAGTGTGAT ACTTGGCACA350 GACTA 355 363 base pairs nucleic acid single linear NO YES DNA(other) 236 GATCGTCACG TGATTTGCCC GTCACGCGAA TCTCTTCCCC CTGAATTTGC 50GCCTGCACCT TCAGTTTGCT GTCTTTAATC AGCTTGACGA TTTTCTTCTG 100 CACGGCGCTTTCAATGCCCT GCTTCAGCTT CGCTTCCACA TACCAGGTTT 150 TACCGCTATG CACGAACTCGTCCGGTACAT CCAGCGAAGC GCTTCAATAC 200 CGCGTTTAAG CAGCTTGGCG CGCAGAATATCGAGCAACTG ATTGACCTGG 250 
AAATCGGACT CGCTCAGCAC TTGATGGTTT ATTGGCATCGTTCAGTTCAT 300 AGTGCTCTAC GCACGGAGTC AAACAGACTC ACTGGAGCTA TCACACGTAC350 GCGCTCTCGA GAT 363 320 base pairs nucleic acid single linear NO YESDNA (other) 237 GATCGTTAAT TAGGCGCTGG GCGTGCTGGA GCAGTAATTT ACCGCCTTCC50 GAGGGGCGTA GTCCTTTACT GTGGCGCTCA AAAAGCGTGA TGCCTATCTC 100 ATCTTCGAGTTGAGATAGCC ACTTCGATAG CGCCGCCTGG GAGATATTCA 150 TCATCCGGGC GACGTGTCCGTTCAAGGGTT GGCCCTGTTC GGCCCAGCGC 200 AACCAGCGTT TGCGGTGATG TAATTTCAATTTCTCCCGTT CCATTCGCTA 250 TAACCTCAGG TTATGTCTCT CCTGAAACCA TTGTACTTTATCCTCCTCTA 300 CACTCGTACT GCACTAACAC 320 406 base pairs nucleic acidsingle linear NO YES DNA (other) 238 GATCCTGCAA CGCTTTCGAC CCGGTCGAAATAATGACTTT TTTCCCGGCG 50 CGCAACGCCG AGCGAGGTAA GCATAGGTCT TCCCGGTTCCGGTGCCGGCT 100 TCAACAACCA GAGGCTGCGC ATTTTCAATC GCTTGTGTTA CGGCAACCGC150 CATTTGTCGC TGTGGTTCGC GCGGTTTAAA GCCGGTTATC GCTTTGGCCA 200GTTGGCATCT GCTGCAAAAT CGTCCGTCAC ACTGCCCCCT GTTAATTTGC 250 ACAGGGATTATGTCAGGGTA GAAAGGCTTA CACAGTTACA GAGGTGACGG 300 CGGCACATTG TGCAGTCTTGAACCATTCAA ATGAAAAGCA AATGAGGAAT 350 AAGTAATGTC TATCGTGCGT ATGATGCGAGATCGTGTCAG ACGTGTGACT 400 CAATAT 406 263 base pairs nucleic acid singlelinear NO YES DNA (other) 239 GATCCTACCG GCCCCCACGC TTTGATTTGAATAATAGAGG CTACCGACGA 50 CAGCGACATG CTGATAATGT GCTGCGTATC CTGCGCCGGTAAACCCAACG 100 CCTGGCAGAT TAACAGCGCT GGCTGATTAC CGCGACAAAC ATGCCACGAG150 ATGCTGACAA GCGCAAAAGG TTGAGGAGCG CGGCGATCTT CAAGACGGTA 200AATTAATCGC TGCACAATTG TACGCGACGA TGCATCTCGC ATGCGTCTAC 250 GACATAGACATCT 263 364 base pairs nucleic acid single linear NO YES DNA (other) 240GATCAACGCC TAATTTGGCC GCACAATCCA GAGAGACCTG CGGGTGCGGT 50 TTGCTGTAGGGCAATTTTTC TGCAGAAGCC AGCGCGTCAA AACTGTCGCG 100 CAGTTCAAAC ATGGTGAGCACTTTTTCCAG CATATGCAGC GGCGATGCCG 150 AGGCAAGCCC CACTAATAGC CCCTGCGCTTTACACAGCGC CACAGCTTCG 200 CGCACACCCG GCAAAAGAGG GCGCTCTCTT TCGATAAGCGTAATCGCGCG 250 GGCAATAACA CGGTTTGTCA CTTCTGGCGA TCGGGCGTTC ACGTTGCTGC300 GCAACAGAGA TCGACAACCA TATCATGCGT AGCAAGCTGT TGCAGCTCAT 350GGCCGAGTAT ATCT 364 221 base pairs nucleic 
acid single linear NO YES DNA(other) 241 GATCATTTTA ATGCTGTGTC TTGCCATTTT TTTCTCCATA AATTTCAAAA 50GGAAATCATG CCTGATGCGC ATTGCGACGG CGTGAGTACC ATTCAAGGAT 100 TTGGTGACGATGCAAACTGA TGGAACGACC AACGACAACA ACAATGAGAA 150 GCGCACCGGA CAATGCGCTGGAATTGATTC GGCACTCCGG CCATCTGTAG 200 CCCTCGTGTA AATCCACCAG C 221 280base pairs nucleic acid single linear NO YES DNA (other) 242 GATCATCGACGTATGTCCTT TCCAGATATT CCGCCCGCCG CCAGCCCACT 50 CAAACAACGG GGGGCGCCGGCAAAAAAGCG AAAGACATCC ACCGATTGCC 100 GGAATTTATA TTAATTACGC CAGTGCAAAGGCTTATTGCA GTTTTGCGAT 150 TCAAGCCGGG CGAACTCAAG GGCGTTTTGC TCGATGCTGTCCGCAGTTTT 200 AACAGACATT CCGCCCGTGC TTTGGGTGTG GTCTGCCCAT TCGGAAACGC250 GTTATCGGCG GCTGATCGCA GCGTAACCTG 280 277 base pairs nucleic acidsingle linear NO YES DNA (other) 243 CACTATAACA ACGGCGCGGC GGTACCTGGGCGACGTCGCC AGCGTCACCG 50 ACTCGGTGCA GGATGTCCGT AACGCCGGGA TGACGAACGCTAAACCCGCT 100 ATTTTGTTGA TGATCCGCAA GCTGCCGGAG TGGAATTCCA CATGTGGAAT150 TCCCATGTCA GCCGTTAAGT GTTCCTGTGT CACTCAAAAT TGCTTTGAGA 200GGCTCTAAGG GTTCTCAGTG CGTTACATCC CTAAGCTTGT TGTCACAACC 250 GTAACTAAACTTAAACCTAT ATATCCT 277 380 base pairs nucleic acid single linear NO YESDNA (other) 244 TGCAGATCAT TGCCTGATGT TCTACGGTCG CAAAATGCAC CAGNNNNCAG50 AACAACGACA GCGACAACAA TACGGCTGAA GCGCTTTAAT CGCGCTAACT 100 CCTTTTTCTCAAAGCCCCTT TCCGTTCACC TGCTATAGCG TNGAGGGGCC 150 CACTTACCAG GAACAAGACTATGAACGTTA TTGCTATCAT GAACCACATG 200 GGCGTCTACT TTAAAGAAGA GCCTATTCGTGAACTGCATC GTGCACTGGA 250 AGGTTTAAAT TTCGTATCGT CTATCAAAAC GACCGAGAAGACCTGCTGAA 300 GCTGATTGAA ATAACTCCGC CTTTNNGTCA TTTCGACTGG GATAATATAC350 CTTGAGCTTC GAGAGAGATA GCAGTGAGCG 380 353 base pairs nucleic acidsingle linear NO YES DNA (other) 245 GATCTGATTA TCGACGCGCT GCTTGGCACCGGCATAGCCC AGGCACCGCG 50 CGACCCGGTA GCCGGTCTGA TTGAACAGGC GAACGCATCCTGCGCCGGTT 100 GTCGCCGTCG ATATCCCGTC AGGTCTGCTG GCGCAAACGG GCGCACGCCT150 GGCGGGTGAT AAGCGCGCGC ATACGGTCAC GTTTATCGCC CTGAAACCAG 200GCCTGCTGAC CGGCAAAGTG CGTGAGCTTA CCGGCATATT GCATTATGAC 250 GTTGGGACTGGAAGGCTGGC TGGCGAGCAG ACGCGCGTCG GTTTTGAAGA 300 
GAGTTGGGGC AATGGCTAACGCGTGACGAC TGATAGGGAT ATGTGTAGAT 350 ATG 353 376 base pairs nucleic acidsingle linear NO YES DNA (other) 246 CACCCGGCTG ACTGCCGTAT AATCCAGCTTTTTACGCGGG TCCGCGGAGG 50 GTTTTGCCGT CACAGAGAGC GTATTCTGCG AGTTTATGGTTGTCTTACCT 100 AACGGATAGC CTTCGCTATC ATAGCGGTAC TCGACCCTTC ATCTCTTTGC150 CCGTCGCCGA TACCACAAAA CCGTTGTCGT CCGTTTCCCA GGTCACGCCC 200GCCGAACGAA CGCCGCCAGC TGGCACTTCC CCTGTAACTG CACCTTTTTT 250 TCCAGCGTCTGAGCATCCCG GTAATAATTG GCATCCAGCA CGAGTGCCAG 300 CCCCGTATTT ATCTCCAGATCGTGTAACTC AAGCGTATCA AAACAGCCTT 350 CCTGTGAAAG CGTACCGCGA CCTCTA 376248 base pairs nucleic acid single linear NO YES DNA (other) 247GATCAAGACG CGAATCCCCG ACGCGCCGAT AACGCCGTAC AACAGCAGCG 50 AGACGCCGCCCATCACGGGT AACGGGATAA TCTGAATCGC CGCCGCCAGT 100 TTGCCAACGC AGGAAAGCATGTAATAACGA AAATCGCGTC GCCGCCGATA 150 ACCCAGGTAC TGTAAACGTC GGTGATCGCCATGACGCCAA TATTTTCCAT 200 AGTGTATCGG CGTGAGTAGA ACCGAATATC GTCGACATCTAGCACATC 248 253 base pairs nucleic acid single linear NO YES DNA(other) 248 TTTCGACAAA GCGCGCCGCC GAGATATTCG CCATGATCAT GCACTCTTCG 50ATAAGCTTAT GCGCGTCATT ACGCTGGGTC TGTTCGATAC GCTCAATGCG 100 ACGTTCGGCGTTAAAGATAA ACTTCGCCTC TTCGCACACA AACGAGATCC 150 CCCCGCGCTC TTCACGCGCTTTATCCAGCA CTTTGTAGAG GTTGTGCAGC 200 TCTTCAATAT GCTTCACAGC GCGCATATGTCACGCAGATC TGATCGCTGC 250 AGC 253 414 base pairs nucleic acid singlelinear NO YES DNA (other) 249 GATCAAACAC CAGACGACCG CGACGCGCACGACCATCGGT GGTATCTAAC 50 TCAAATTTCA TTATCACTCC TGCGTCAGAA AAACAGTCCGACGTTTAACG 100 ACTCGCTACG GAATGATTCC ATAGCTAATA AATTCCCGAA GACGTCATCG150 GCGCAGAGTT TGGGGTCGAC CAGCGCACAG CCACCGGAGC GTACACGCAG 200TACGTGAGGA TGGCGAGCAC TGCCGCGTCA AATGCAGTGA GATAGCTCTA 250 CGACGTCAGAATAGCTGCGA TGTACGTGAT AACTGCTCCG TAGCTAAAAG 300 CATTTGTCTA CGCAGTCTATAGGCATCATG TGTGTGATAC GCATGCGAAC 350 AGCATACACG TGATCGCAGA TGAGTGTGATCAGGCATATA CTGACGAACT 400 GATATAGATT CGTG 414 112 base pairs nucleicacid single linear NO YES DNA (other) 250 GATCTTCCGG GTTCACGGCCACGCGGTAAT TCTGCCGAGA ATAGTTTTCG 50 GGCGGGTGGT GGCGACAACC AGAAATCTTACCGTCGCGGT 
TTTCGCGCCG 100 TCGGCCAGCG GA 112 345 base pairs nucleic acidsingle linear NO YES DNA (other) 251 GATCGTTAAA TGTGCGGTAA TCCTGTGATGAATACCGATA CGCAGCCAGA 50 CCAAACCGAG TTAATGTTTG GGTCAGGTAT TTATTATAAGCAATCTGATA 100 ACTCTGACCA TCAAATACGA CGCCATTATC CTGTTTACTG TGCGCTCGCG150 TAGCTCAAGC GAAATGGCGC CAATCCGGGT ATTCCACCCC GTGCCGAGGG 200TAAACGCATT ATAATGGTTC GATAGCATCG TACGCATAAG CGTCAACAGG 250 TTATTAGGCATACTGATACT GATTGGTAAA TCGGCTGATA TCGGCGCTTC 300 AATTATGACT ACGCGCGAAATCATACTGAG CCGTCCAGTC CATTC 345 203 base pairs nucleic acid singlelinear NO YES DNA (other) 252 GATCGGTCGC CGCCTTACCT TTTTCCAGTACACTGAGCAG TTCGCTCAGC 50 AGTTGTTCAA CAGCTCCATC ACTAGAGCGG GAGAGTTCTGGCATAAATCA 100 AAATCTGTTT GTTCATGAAA CGGCAACACA TTAACCGCAG CAACAGTTTT150 TTTCTGCATT TTTCGGCCTA AATCATCGCC TTACGATACT CTGAATACAG 200 GGG 203273 base pairs nucleic acid single linear NO YES DNA (other) 253GATCGTAATC ATTCACTTCG GTCAGCAGCT CGAGCACTAA CGCGTCGAGC 50 ACGCCTTCCATCGGCGCCAG TAAAACACGC ATATCGGTAT CCACAGCAAA 100 AAAAGAGGCG CTATCATAACGCCTCTCTGC GATGAGCAAA ACTTTTTTGC 150 CGGGTGGCGG CGCAAACGCA CGCTACGTACGTAAGTGCTC ACGCGGCTTC 200 AAGACCAGTT ATTTTTCCAG CCGACCAGCC ATTCGAACCGCGATAAGCTC 250 TGCGATCCTT TCCAAGTATG CTG 273 154 base pairs nucleic acidsingle linear NO YES DNA (other) 254 GATCTTCTCG CTTTCTTCAG GGCTTACTCCCGTCTCTTCT TCATCGACCG 50 TGATCAAAAT ACCGTCTTTA TCCACCAAGA AGCCGACTTCAATCTTCGTA 100 TGAAAATAGC TCACCATTAC GAACTATATT TTTCATCTCT CTTTCCAGCT150 TTTT 154 348 base pairs nucleic acid single linear NO YES DNA(other) 255 CGCTGTTCTG GTGTTAAGAC TTTGCTTAAA TCAAAATAAT ATTTAACCCG 50ATAATAGCGA GCCTGTTGTT CTATGTTACT GAAGGCTGCA AGCTGCTGTT 100 TTACGGCGGCGTCATCCCAT TTACCGGATT TAATCACCTC TATCAGCGCA 150 CCGTCTTTAA TTCCCTTCATAGAAATCTGA CTGACGTCGG TTTCCAGTTG 200 TTGGTGAAGT TTTTTGATCC GGGTAATCTGATCGTTTGTC AGCTTCAGAT 250 GCTGGACAAT AGGATCCTGG GCGGGCAGGG GGAGGATTGGGGACAGCGTG 300 CAAGCAAAAG AAACGCGCAG AGTCGCTGCA GTAAGTGGGC ATACGTTT 348

What is claimed is:
 1. A method of identifying microbial coding sequences which are specifically induced in a pathogenic microorganism during infection of a host, comprising the steps of: (a) providing a one to four kilobase fragment of size fractionated chromosomal DNA that shares homology to a genomic DNA of the pathogenic microorganism; (b) infecting a host with a pool of fusion strains wherein said fusion strains are constructed by integrating an expression plasmid into the genomic DNA of the pathogenic microorganism that is either: (i) an auxotrophic mutant strain of said pathogenic microorganism that lacks a transposition competent element, or (ii) sensitive to an antibiotic, wherein said expression plasmid comprises: A) a promoterless synthetic operon comprising two genes, wherein the first gene complements the mutation of the pathogenic microorganism or confers resistance to said antibiotic, and the second gene functions as a reporter gene, and B) the one to four kilobase fragment of size fractionated chromosomal DNA that shares homology to said genomic DNA of the pathogenic microorganism; (c) treating said host with said antibiotic if said first gene of said synthetic operon confers resistance to said antibiotic; (d) harvesting from said host the fusion strains that survive and propagate in said host after step (a); and (e) detecting expression of said one to four kilobase fragment by identifying harvested fusion strains from step (d) that fail to express said reporter gene in vitro.
 2. The method of claim 1, wherein said pathogenic microorganism is sensitive to chloramphenicol, and said first gene expresses chloramphenicol acetyltransferase.
 3. The method of claim 1, wherein said auxotrophic mutant strain is deficient for adenosine 5′-monophosphate and said first gene expresses adenosine 5′-monophosphate.
 4. The method of claim 1, wherein said auxotrophic mutant strain is deficient for thymidylate synthetase and said first gene expresses thymidylate synthetase.
 5. The method of claim 1, wherein said second gene encodes a protein, the expression of which is assessable in vitro.
 6. The method of claim 5, wherein said second gene is selected from the group comprised of lacZY coding sequence, a luciferase coding sequence, and a human growth hormone coding sequence.
 7. The method of claim 1, wherein said pathogenic microorganism is a bacterium.
 8. The method of claim 1, wherein said one to four kilobase fragment comprises a sequence that induces expression of said promoterless synthetic operon.
 9. A method of identifying microbial coding sequences according to claim 1, further comprising the steps of: (f) sequencing the one to four kilobase fragments from step (e); (g) identifying one or more aberrant fragments from step (f) having a total guanine and cytosine content of less than about 49% or greater than about 59%; and (h) detecting and identifying a microorganism by hybridizing one or more of said aberrant fragments to genomic DNA derived from said microorganism.
 10. A method of identifying microbial coding sequences according to claim 1, further comprising the steps of: (f) sequencing the one to four kilobase fragments from step (e); (g) identifying one or more aberrant fragments from step (f) having a total guanine and cytosine content of less than 49% or greater than 59%; and (h) detecting and identifying a microorganism by hybridizing one or more of said aberrant fragments to genomic DNA derived from said microorganism.
 11. A method of identifying microbial coding sequences which are specifically induced in a pathogenic microorganism during infection of a host and which may be used as probes to detect and identify pathogens, comprising the steps of: (a) providing a one to four kilobase fragment of size fractionated chromosomal DNA that shares homology to a genomic DNA of the pathogenic microorganism; (b) infecting a host with a pool of fusion strains wherein said fusion strains are constructed by integrating an expression plasmid into the genomic DNA of a pathogenic microorganism that is either: (i) an auxotrophic mutant strain of said pathogenic microorganism that lacks a transposition competent element, or (ii) sensitive to an antibiotic, and wherein said expression plasmid comprises: A) a promoterless synthetic operon comprising two genes, wherein the first gene complements the mutation of the pathogenic microorganism or confers resistance to said antibiotic, and the second gene functions as a reporter gene, and B) a one to four kilobase fragment of chromosomal DNA that shares homology to said genomic DNA of the pathogenic microorganism; (c) treating said host with said antibiotic if said first gene of said synthetic operon confers resistance to said antibiotic; (d) harvesting from said host said fusion strains that survive and propagate in said host after step (b); (e) detecting expression of said one to four kilobase fragments by identifying harvested fusion strains from step (d) that fail to express said reporter gene in vitro; (f) sequencing the one to four kilobase fragments from step (e); (g) identifying one or more aberrant fragments from step (f) having a total guanine and cytosine content of less than about 49% or greater than about 59%; and (h) detecting and identifying a microorganism by hybridizing one or more of said aberrant fragments to genomic DNA derived from said microorganism.
 12. A method of identifying microbial coding sequences according to claim 11, wherein the one or more aberrant fragments from step (f) have a total guanine and cytosine content of less than 49% or greater than 59%.
 13. In a method of identifying microbial coding sequences which are specifically induced in a pathogenic microorganism during infection of a host, wherein: (a) a host is infected with a pool of fusion strains wherein said fusion strains are constructed by integrating an expression plasmid into the genomic DNA of a pathogenic microorganism that is either: (i) an auxotrophic mutant strain of said pathogenic microorganism, or (ii) sensitive to an antibiotic, and wherein said expression plasmid comprises: A) a promoterless synthetic operon comprising two genes, wherein the first gene complements the mutation of the pathogenic microorganism or confers resistance to said antibiotic, and the second gene functions as a reporter gene, and B) a fragment of chromosomal DNA that shares homology to said genomic DNA of the pathogenic microorganism; (b) treating said host with said antibiotic if said first gene of said synthetic operon confers resistance to said antibiotic; (c) harvesting from said host fusion strains that survive and propagate in said host after step (a); and (d) detecting expression of said fragment by identifying harvested fusion strains from step (c) that fail to express said reporter gene in vitro, wherein the improvement comprises: providing a one to four kilobase fragment of size fractionated chromosomal DNA that shares homology to a genomic DNA of the pathogenic microorganism; and selecting an auxotrophic mutant strain of said pathogenic microorganism which lacks a transportation competent element.
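
The guanine and cytosine screen recited in claims 9 through 12 can be illustrated with a short sketch. It is not part of the claimed method and is offered only as one possible way to compute the percentage. The function names, the toy sequences, and the choice to ignore ambiguous bases (N) when computing the percentage are all assumptions made for illustration. The 49% and 59% bounds are taken from the claims and applied as strict cutoffs, so the "about" language of claims 9 and 11 is not modeled.

```python
from typing import Dict, List

GC_LOW = 49.0   # percent; lower bound recited in the claims
GC_HIGH = 59.0  # percent; upper bound recited in the claims


def clean_sequence(raw: str) -> str:
    """Keep only nucleotide letters, dropping spaces and position counters.

    Assumes `raw` holds only the sequence block of a listing entry,
    not its descriptive header text.
    """
    return "".join(ch for ch in raw.upper() if ch in "ACGTN")


def gc_percent(seq: str) -> float:
    """Percent G+C among unambiguous bases (N is ignored; an assumption)."""
    called = [base for base in seq if base in "ACGT"]
    if not called:
        raise ValueError("sequence has no unambiguous bases")
    gc_count = sum(1 for base in called if base in "GC")
    return 100.0 * gc_count / len(called)


def aberrant_fragments(
    fragments: Dict[str, str],
    low: float = GC_LOW,
    high: float = GC_HIGH,
) -> List[str]:
    """Return names of fragments whose G+C content is below `low` or above `high`."""
    flagged = []
    for name, raw in fragments.items():
        pct = gc_percent(clean_sequence(raw))
        if pct < low or pct > high:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    # Hypothetical toy fragments, not sequences from the listing.
    toy = {
        "toy_at_rich": "ATATATATAT",
        "toy_in_range": "ACGTACGTAC",
        "toy_gc_rich": "GCGCGCGCAT",
    }
    for name in aberrant_fragments(toy):
        print(name)
```

Because `clean_sequence` discards digits and whitespace, a sequence block copied from the listing with its running position counters can be passed in directly, provided the descriptive header text of the entry is left out.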